METHODS FOR THE PRODUCTION OF IPS 
CELLS USING EPSTEIN-BARR (EBV)-BASED 
REPROGRAMMING VECTORS 


This application is a divisional of U.S. application Ser. 
No. 12/478,154, filed on Jun. 4, 2009, which claims priority 
to U.S. Application No. 61/058,858, filed on Jun. 4, 2008 
and U.S. Application No. 61/160,584, filed on Mar. 16, 
2009. The entire text of each of the above referenced 
disclosures is specifically incorporated herein by reference. 


BACKGROUND OF THE INVENTION 


1. Field of the Invention 

The present invention relates generally to the field of 
molecular biology, stem cells and differentiated cells. More 
particularly, it concerns differentiation programming or 
reprogramming of somatic cells and undifferentiated cells. 

2. Description of Related Art 

In general, stem cells are undifferentiated cells which can 
give rise to a succession of mature functional cells. For 
example, a hematopoietic stem cell may give rise to any of 
the different types of terminally differentiated blood cells. 
Embryonic stem (ES) cells are derived from the embryo and 
are pluripotent, thus possessing the capability of developing 
into any organ or tissue type or, at least potentially, into a 
complete embryo. 

Induced pluripotent stem cells, commonly abbreviated as 
iPS cells or iPSCs, are a type of pluripotent stem cell 
artificially derived from a non-pluripotent cell, typically an 
adult somatic cell, by inserting certain genes. Induced pluri- 
potent stem cells are believed to be identical to natural 
pluripotent stem cells, such as embryonic stem cells, in many
respects, such as in terms of the expression of certain stem 
cell genes and proteins, chromatin methylation patterns, 
doubling time, embryoid body formation, teratoma forma- 
tion, viable chimera formation, and potency and differen- 
tiability, but the full extent of their relation to natural 
pluripotent stem cells is still being assessed. 

iPS cells were first produced in 2006 (Takahashi et al.,
2006) from mouse cells and in 2007 from human cells
(Takahashi et al., 2007; Yu et al., 2007). This has been cited
as an important advancement in stem cell research, as it may 
allow researchers to obtain pluripotent stem cells, which are 
important in research and potentially have therapeutic uses, 
without the controversial use of embryos. 

However, at this stage in the study of these induced
pluripotent stem (iPS) cells, researchers are using integrating
viral plasmids, which insert the genes into the genome
of target cells, potentially introducing mutations at the 
insertion site. Therefore, there is a need to develop a method 
to induce pluripotent stem cells essentially free of exog- 
enous viral components. 

Due to the significant medical potential of cell therapy 
and tissue transplantation, there also exists an urgent need 
for the production of any desired cell types by altering 
cellular differentiation status of an available cell population. 
Each specialized cell type in an organism expresses a subset 
of all the genes that constitute the genome of that species. 
Each cell type is defined by its particular pattern of regulated 
gene expression. Cell differentiation is thus a transition of a 
cell from one cell type to another and it involves a switch 
from one pattern of gene expression to another. Cellular 
differentiation during development can be understood as the 
result of a gene regulatory network. A regulatory gene and 
its cis-regulatory modules are nodes in a gene regulatory 
network; they receive input and create output elsewhere in 
the network. Similar mechanisms may also apply to
dedifferentiation, for example, inducing pluripotency from 
somatic cells as mentioned above, and transdifferentiation, 
specifically referring to transformation of one differentiated 
cell type into another. Transcription factors controlling the 
development choices have been studied to change differen- 
tiation status; however, viral vectors have also been widely 
used. Therefore, there is a need for improved viral-free
differentiation programming methods. 


SUMMARY OF THE INVENTION 


The present invention overcomes a major deficiency in 
the art in providing induced pluripotent stem cells and other 
desired cell types essentially free of exogenous vector 
elements by differentiation programming. In a first embodi- 
ment there is provided a method for producing an induced 
pluripotent stem (iPS) cell population, the method compris- 
ing the steps of: a) obtaining a reprogramming vector, 
elements of the vector comprising a replication origin and
one or more expression cassettes encoding iPS reprogram- 
ming factors; b) introducing the reprogramming vector into 
cells of a population of somatic cells; c) culturing the cells 
to expand the population; d) selecting progeny cells of said 
expanded population, wherein said progeny has one or more 
characteristics of embryonic stem cells; and e) culturing the 
selected progeny cells to provide the iPS cell population, 
wherein one or more of said expression cassettes comprise 
a nucleotide sequence encoding a trans-acting factor that 
binds to the replication origin to replicate an extra-chromo- 
somal template, and/or wherein the somatic cells express 
such a trans-acting factor. In a further aspect, step c or step 
e further comprises culturing until the cells are essentially 
free of the vector elements or comprises an additional
selection step as described below to facilitate generation of
vector-free iPS cells. 

In certain aspects, in order to replicate an extra-chromo- 
somal template, one or more of the expression cassettes 
comprise a nucleotide sequence encoding a trans-acting 
factor that binds to the replication origin; alternatively, the 
somatic cells comprise a nucleotide sequence encoding a 
trans-acting factor that binds to the replication origin. 

In exemplary embodiments, the replication origin may be 
a replication origin of a lymphotrophic herpes virus or a 
gammaherpesvirus, an adenovirus, SV40, a bovine papil- 
loma virus, or a yeast, such as a replication origin of a 
lymphotrophic herpes virus or a gammaherpesvirus corre- 
sponding to oriP of EBV. In a further aspect, the lympho- 
trophic herpes virus may be Epstein Barr virus (EBV), 
Kaposi's sarcoma herpes virus (KSHV), Herpes virus saimiri
(HS), or Marek's disease virus (MDV). In a still further
aspect, the gammaherpesvirus may be Epstein Barr virus 
(EBV) or Kaposi's sarcoma herpes virus (KSHV). 

In certain embodiments, the trans-acting factor may be a 
polypeptide corresponding to, or a derivative of, a wild-type
protein corresponding to EBNA-1 of EBV, preferably in the 
presence of a replication origin corresponding to OriP of 
EBV. The derivative may have a reduced ability to activate 
transcription from an integrated template as compared to 
wild-type EBNA-1 and thus reduced chances to ectopically 
activate chromosome genes to cause oncogenic transforma- 
tion. Meanwhile, the derivative may activate transcription at 
least 5% that of the corresponding wild-type protein from an
extra-chromosomal template after the derivative binds the 
replication origin. Such a derivative may have a deletion of 
residues corresponding to residues about 65 to about 89 of 
wild-type EBNA-1 (SEQ ID NO:1 referring to the wild-type 
EBNA-1 protein sequence, which is encoded by SEQ ID
NO:2), and/or has a deletion of residues corresponding to 
residues about 90 to about 328 of EBNA-1 (SEQ ID NO:1), 
or may be a derivative with at least 80% amino acid 
sequence identity to residues 1 to about 40 and residues 
about 328 to 641 of EBNA-1 (SEQ ID NO:1). Amino acids 
90-328 of wild-type EBNA-1 comprise a region rich in 
Gly-Ala repeats that should not contribute significantly to 
EBNA-1's function in the present invention and therefore
this region may be variable in terms of the number of repeats
present (i.e., the region may be deleted in whole or in part). An
exemplary derivative of a wild-type EBNA-1 may have a 
sequence of SEQ ID NO:3, which is encoded by SEQ ID 
NO:4. 
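
The residue ranges recited above can be illustrated with a minimal Python
sketch. It is an illustration only: it assumes 1-based, inclusive residue
numbering, treats the "about" boundaries as exact, and uses a synthetic
641-residue placeholder string rather than the actual sequence of SEQ ID
NO:1. The function names delete_residues and percent_identity are
hypothetical and are not part of the disclosure.

```python
def delete_residues(seq, ranges):
    """Return seq with each 1-based, inclusive (start, end) range removed."""
    drop = set()
    for start, end in ranges:
        if not 1 <= start <= end <= len(seq):
            raise ValueError(f"range {start}-{end} outside 1..{len(seq)}")
        drop.update(range(start, end + 1))
    return "".join(aa for pos, aa in enumerate(seq, start=1) if pos not in drop)


def percent_identity(a, b):
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Placeholder only: a synthetic 641-residue string standing in for wild-type
# EBNA-1. It is NOT the sequence of SEQ ID NO:1.
WILD_TYPE_LENGTH = 641
placeholder = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(WILD_TYPE_LENGTH))

# Delete residues 65-89 and 90-328 (boundaries treated as exact for illustration).
derivative = delete_residues(placeholder, [(65, 89), (90, 328)])
print(len(derivative))  # 641 - 25 - 239 = 377 residues remain
```

In this sketch, an identity criterion such as "at least 80%" would be checked
by comparing percent_identity against 80.0 for sequences that have already
been aligned; the alignment step itself is outside the scope of the example.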

In certain further embodiments, the invention involves an 
additional step of selecting progeny cells of the expanded 
population, wherein the progeny is essentially free of the 
vector elements. Because extra-chromosomally replicated
vectors, such as OriP-based vectors, will be lost from cells
over time, such as during the two weeks post-transfection, and
iPS cells do not need exogenous reprogramming factors
after entering a self-maintaining pluripotent state, this
optional additional selection step may help accelerate
generation of vector-free pluripotent stem cells. Therefore, the
additional step may be at a time after the progeny cells enter 
a self-sustaining pluripotent state, such as at least about 10 
days to at least 30 days after the reprogramming vectors are 
introduced into cells. To facilitate the process to generate 
vector element-free iPS cells, the reprogramming vector 
may further comprise a nucleotide sequence encoding a 
negative selection marker, and the additional step selects 
progeny cells of the expanded population by eliminating 
progeny cells comprising the selection marker with a selec- 
tion agent. For example, the selection marker may encode 
herpes simplex virus-thymidine kinase, allowing for application
of a selection agent such as ganciclovir to remove
cells with residual vectors encoding the kinase. In certain 
aspects, the iPS cell population generated by the methods is 
essentially free of the selection marker. An alternative or 
complementary approach is to test for the absence of exogenous
genetic elements in progeny cells, using conventional methods,
such as RT-PCR, PCR, FISH (fluorescent in situ
hybridization), gene array, or hybridization (e.g., Southern 
blot). 

In some embodiments, the iPS cell population generated 
from the above methods may be essentially free of inte- 
grated, reprogramming vector genetic elements, or essen- 
tially free of vector genetic elements. 

In a further aspect, the reprogramming vector may be 
introduced into the cells by liposome transfection, electroporation,
particle bombardment, calcium phosphate, polycation,
or polyanion, or any methods suitable for introducing
exogenous genetic elements into the cells.

In still further aspects of the invention, the somatic cells 
may be from mammals, or more specifically, humans. The 
somatic cells may be terminally differentiated cells, or tissue 
stem cells, including, but not limited to, fibroblasts, 
hematopoietic cells, or mesenchymal cells. For example, the 
somatic cells are fibroblasts. The somatic cells may be from 
a tissue cell bank or from a selected human subject, spe- 
cifically, a live human. Genomes from progeny of these 
somatic cells will be considered to be derived from these 
somatic cells of a certain source, such as a selected human 
individual. 

In some further aspects, the progeny cells could be 
selected for one or more embryonic stem cell characteristics, 
such as an undifferentiated morphology, an embryonic stem 
cell-specific marker or pluripotency or multi-lineage differentiation
potential or any characteristics known in the art.
Specifically, the progeny cells may be selected for an 
undifferentiated morphology because of its convenience. 
The embryonic stem cell-specific marker could be one or 
more specific markers selected from the group consisting of 
SSEA-3, SSEA-4, Tra-1-60 or Tra-1-81, Tra-2-49/6E, 
GDF3, REX1, FGF4, ESG1, DPPA2, DPPA4, and hTERT. 
This selection step may be employed at more than one time
point after transfection to ensure that cells are in a pluripotent
state and do not return to a differentiated state.

Furthermore, in certain aspects of the invention, positive 
selection markers are known in the art and may be used in 
the methods and compositions of the invention to improve
the efficiency of transfection or concentration of transfected 
cells in a time period sufficient for establishment of a 
self-sustaining pluripotent state. For example, in some 
aspects, the reprogramming vector may further comprise a 
positive selection marker such as a nucleotide sequence 
encoding an antibiotic resistance factor (e.g., neomycin or
hygromycin resistance marker), or a fluorescent or lumines- 
cent protein (e.g., GFP, RFP, CFP, etc.). After the somatic 
cells are introduced with the reprogramming vector, use of 
the positive selection marker may help concentrate cells 
having reprogramming vectors. However, this step is
optional and depends on the transfection efficiency and
vector loss rate. If transfection efficiency is high (such as
more than 90%) and the vector loss is sufficiently slow for
cells to establish a self-sustaining pluripotent state, this 
positive selection may not be necessary. 

In still further embodiments of the invention, the iPS 
reprogramming factors may comprise at least one member
from the Sox family and at least one member from the Oct family,
specifically, Sox-2 and Oct-4. Sox and Oct are thought to be 
central to the transcriptional regulatory hierarchy that speci- 
fies ES cell identity. Additional factors may increase the 
reprogramming efficiency, such as a set comprising Sox-2, 
Oct-4, Nanog and, optionally, Lin-28; or comprising Sox-2, 
Oct-4, Klf and, optionally, c-Myc.

In some further aspects of the above methods, step d may
range from at least 8 days to at least 30 days, or any
intermediate number of days, after step
b, for the time period required to establish a self-sustaining
pluripotent state.

The skilled artisan will understand that expression cas- 
settes may be operably linked to a transcriptional regulator 
element, such as a promoter or enhancer.

In a further aspect, a reprogramming vector, comprising a 
replication origin and one or more expression cassettes 
encoding a trans-acting factor that binds to the replication 
origin to replicate an extra-chromosomal template; and iPS 
reprogramming factors is also disclosed. The iPS repro- 
gramming factors may comprise Sox and Oct, more specifi- 
cally, Sox-2 and Oct-4, for example, a set which comprises 
Sox-2, Oct-4, Nanog and, optionally, Lin-28; or comprises 
Sox-2, Oct-4, Klf and, optionally, c-Myc. 

In certain aspects, the reprogramming vector replicates
extra-chromosomally and/or lacks the ability to be
integrated into a host cell
genome. In exemplary embodiments, the replication origin
may be a replication origin of a lymphotrophic herpes virus 
or a gammaherpesvirus, an adenovirus, SV40, a bovine 
papilloma virus, or a yeast, such as a replication origin of a 
lymphotrophic herpes virus or a gammaherpesvirus corresponding
to oriP of EBV. In a further aspect, the lymphotrophic
herpes virus may be Epstein Barr virus (EBV),
Kaposi's sarcoma herpes virus (KSHV), Herpes virus saimiri
(HS), or Marek's disease virus (MDV). Epstein Barr
virus (EBV) and Kaposi's sarcoma herpes virus (KSHV) are
also examples of a gammaherpesvirus. 

In certain embodiments of the reprogramming vector, the 
trans-acting factor may be a polypeptide corresponding to, 
or a derivative of, a wild-type protein corresponding to
EBNA-1 of EBV. The derivative may activate transcription 
at least 5% that of the corresponding wild-type protein from 
an extra-chromosomal template after the derivative binds the 
replication origin, and/or have a reduced ability to activate 
transcription from an integrated template as compared to 
wild-type EBNA-1 and thus reduced chances to ectopically 
activate chromosome genes to cause oncogenic transforma- 
tion. 

Examples of a derivative may include a derivative which
lacks sequences present in the wild-type EBNA-1 protein 
that activate transcription from an integrated template, a 
derivative which has a nuclear localization sequence, a 
derivative which has a deletion of residues corresponding to 
residues about 65 to about 89 of EBNA-1 (SEQ ID NO:1), 
and/or has a deletion of residues corresponding to residues 
about 90 to about 328 of EBNA-1 (SEQ ID NO:1), a 
derivative with at least 80% amino acid sequence identity to
residues 1 to about 40 and residues about 328 to 641 of 
EBNA-1 (SEQ ID NO:1), or a derivative comprising a first 
nucleotide sequence encoding residues 1 to about 40 of the 
corresponding wild-type EBNA-1 and a second nucleotide 
sequence encoding residues about 328 to 641 of the corre- 
sponding wild-type EBNA-1. 

In a further aspect, an iPS cell population produced 
according to the preceding method is also claimed. In a still 
further aspect, there may be also disclosed an iPS cell 
population that is essentially free of exogenous retroviral 
elements or an iPS cell population essentially free of exog- 
enous viral elements or any exogenous nucleic acid ele- 
ments, such as vector genetic elements; more specifically, 
the cell population may comprise the genome of a selected 
human individual. In a further aspect, an iPS cell population 
may comprise cells whose genome is derived from a termi- 
nally differentiated human cell such as a primary skin cell 
(e.g., a fibroblast) and essentially free of exogenous retro- 
viral elements or any exogenous nucleic acid or vector 
genetic elements. "Essentially free" of exogenous DNA
elements means that less than 1%, 0.5%, 0.1%, 0.05% or any
intermediate percentage of cells of the iPS cell population
comprise exogenous DNA elements.
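
As a rough illustration of how the "essentially free" criterion could be
checked against cell counts, the following Python sketch compares the
fraction of cells found to carry exogenous DNA elements with a chosen
threshold. The function name is_essentially_free, its parameters, and the
example counts are hypothetical and introduced only for illustration; how
the cells are assayed (e.g., by PCR or hybridization) is not modeled.

```python
# Thresholds recited in the text: 1%, 0.5%, 0.1% and 0.05%, as fractions.
ESSENTIALLY_FREE_THRESHOLDS = (0.01, 0.005, 0.001, 0.0005)


def is_essentially_free(cells_with_exogenous_dna, total_cells, threshold=0.01):
    """Return True if the fraction of cells carrying exogenous DNA is below threshold."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= cells_with_exogenous_dna <= total_cells:
        raise ValueError("cells_with_exogenous_dna must lie in 0..total_cells")
    return cells_with_exogenous_dna / total_cells < threshold


# Hypothetical counts: 3 of 1000 assayed cells carry exogenous DNA (0.3%).
for t in ESSENTIALLY_FREE_THRESHOLDS:
    print(t, is_essentially_free(3, 1000, threshold=t))
```

With these hypothetical counts, the population falls below the 1% and 0.5%
thresholds but not below the 0.1% or 0.05% thresholds.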

In a still further aspect, a differentiated cell, tissue or 
organ, which has been differentiated from the iPS cell 
population as described above may be disclosed. The dif- 
ferentiated cell may comprise a hematopoietic cell, a myo- 
cyte, a neuron, a fibroblast or an epidermal cell; the tissue 
may comprise nerve, bone, gut, epithelium, muscle, carti- 
lage or cardiac tissue; the organ may comprise brain, spinal 
cord, heart, liver, kidney, stomach, intestine or pancreas. In 
certain aspects, the differentiated cell, tissue or organ may be 
used in tissue transplantation, drug screening or developmental
research to replace embryonic stem cells. 

The viral-free methods can be used for inducing any 
changes in differentiation status of a cell. In certain aspects, 
there is also provided a method of providing a cell popula- 
tion having an altered differentiation status relative to a 
starting cell population and having cells that are essentially 
free of programming vector genetic elements, the method 
comprising the steps of: a) obtaining a starting population of 
cells having a first differentiation status; b) obtaining one or 
more differentiation programming vectors, each vector com- 
prising a replication origin and one or more expression 
cassettes encoding one or more differentiation programming
factors that, in combination, can alter the differentiation 
status of the starting cell population to a second differen- 
tiation status, wherein one or more of said expression 
cassettes comprise a nucleotide sequence encoding a trans- 
acting factor that binds to the replication origin to replicate 
an extra-chromosomal template, and/or wherein the cells of 
the starting population express such a trans-acting factor; c) 
introducing the differentiation programming vector(s) into 
cells of the starting cell population; d) culturing the cells to 
effect expression of the one or more reprogramming factors 
such that traits consistent with the second differentiation 
status arise in at least a portion of cultured cells; and e) 
further culturing cells having the traits for a sufficient 
number of generations to provide a cell population that 
comprises cells having the second differentiation status but
which cells are essentially free of programming vector 
genetic elements. 

In certain aspects, in order to replicate an extra-chromo- 
somal template, one or more of the expression cassettes in 
at least one differentiation programming vector comprise a
nucleotide sequence encoding a trans-acting factor that 
binds to the replication origin; alternatively, the starting cells 
may comprise a nucleotide sequence encoding a trans-acting 
factor that binds to the replication origin. 

There may be three ways of altering the differentiation 
status in the present invention: dedifferentiation (which may 
be further defined as reprogramming), differentiation, or 
transdifferentiation. 

In a certain aspect of the invention, an example of 
dedifferentiation is induction of pluripotency from somatic 
cells, such as a fibroblast, a keratinocyte, a hematopoietic 
cell (e.g., a lymphocyte), a mesenchymal cell, a liver cell, a 
stomach cell, or a β cell. The traits of the cells of the second
differentiation status can be further defined as one or more 
characteristics of embryonic stem cells. To induce pluripo- 
tency in the differentiation programming methods, the pro- 
gramming factors may be further defined as reprogramming 
factors that comprise Sox and Oct, more specifically, Sox-2 
and Oct-4, optionally in combination with one or more 
additional factors, such as Nanog, Lin-28, Klf, c-Myc or 
Esrrb. The starting cell may also be a less differentiated cell, 
such as a hematopoietic stem cell, a neural stem cell, or a 
mesenchymal stem cell, or corresponding progenitor cells, 
which may express certain programming factors endogenously
and may be more easily reprogrammed into pluripotent
cells with the need for fewer factors. For example,
neural progenitor cells may be reprogrammed into pluripo- 
tent cells in the absence of exogenous Sox-2 expression. The 
method may also include an additional step of differentiating
the target cell population which is programmed to a more
pluripotent status based on the steps described above. 

In another aspect, differentiation methods are also 
included, such as inducing a more specified cell fate of a 
pluripotent or a multipotent cell: for example, differentiation 
of an embryonic stem cell or an induced pluripotent stem 
cell into a more differentiated cell, such as a hematopoietic 
progenitor, an endoderm progenitor, a pancreatic progenitor, 
an endothelial progenitor, or a retina progenitor, or even 
further to a terminally differentiated cell, such as a cardio- 
myocyte, a blood cell, a neuron, a hepatocyte, an islet beta 
cell, or a retina cell; or differentiation of a multipotent cell 
like a hematopoietic stem cell, a neural stem cell, or a 
mesenchymal stem cell as well as a hematopoietic progeni- 
tor, an endoderm progenitor, a pancreatic progenitor, or an 
endothelial progenitor. A specific example is that a pluripo- 
tent cell may be differentiated into an endoderm progenitor 
with the methods using SOX, such as SOX7 or SOX17, as
the programming factors. Another example is that a 
hematopoietic progenitor may be differentiated into a B 
lymphocyte with the methods using EBF1 as the differen- 
tiation programming factor. 

In a further aspect, the methods of the present invention 
can be also used for transdifferentiation of a differentiated 
cell type to another differentiated cell type. The starting cell 
and the altered cell may both be terminally or particularly 
differentiated, for example, a B lymphocyte may be pro- 
grammed into a macrophage with programming factors such 
as C/EBP (more specifically, C/EBPα and C/EBPβ), or an
exocrine cell may be programmed into a hepatocyte with
factors such as C/EBPβ or into an islet β-cell with factors
comprising Ngn3 (also known as Neurog3), Pdx1 and Mafa. 

Furthermore, in certain aspects of the invention, the 
methods may also comprise an additional step of selecting 
cells of the cultured cells, which cells are essentially free of 
differentiation programming vector genetic elements, for 
example, by selecting cells of the cultured cells, which cells 
are essentially free of a selection marker comprised in the 
differentiation programming vector, or by directly testing for the
presence of vector genetic elements by methods known in
the art. For example, the programming vector may comprise 
a selection marker such as a nucleotide sequence encoding 
an antibiotic resistance factor (e.g., neomycin or hygromycin
resistance marker), a fluorescent or luminescent protein 
(e.g., GFP, RFP, CFP, etc.), or an enzyme (e.g., thymidine 
kinase). The selection for loss of vector genetic elements 
may be at or after a time when the second differentiation 
status has been established. 

In a further aspect, the differentiation programming vector
may be introduced into the starting cells by liposome 
transfection, electroporation, particle bombardment, cal- 
cium phosphate, polycation, or polyanion or any methods 
suitable for introducing exogenous genetic elements into
the cells. The starting cells may be mammalian cells, more 
specifically, human cells. In a further aspect, a cell of the 
second differentiation status and essentially free of vector 
genetic elements produced according to the preceding meth- 
ods is also provided. 

In a still further aspect, there is also provided a differen- 
tiation programming vector, comprising a replication origin 
and one or more expression cassettes encoding a trans-acting 
factor that binds to the replication origin to replicate an 
extra-chromosomal template; and one or more differentia- 
tion programming factors. The differentiation programming 
factors may be selected from the group consisting of Sox 
(e.g., Sox-2, Sox-7, Sox-17), Oct (e.g., Oct-4), Nanog, 
Lin-28, Klf, c-Myc, Esrrb, EBF1, C/EBP (e.g., C/EBPα,
C/EBPβ), Ngn3, Pdx and Mafa. Specific examples of the
differentiation programming vector backbone may be an 
episomal expression vector, such as pCEP4, pREP4, or 
pEBNA DEST from Invitrogen. In a certain aspect, the 
differentiation programming vector may be further defined 
as a reprogramming vector. The reprogramming vector may
comprise a Sox family member and an Oct family member, 
such as Sox-2 and Oct-4, and may further comprise one or 
more factors, such as Nanog, Lin-28, Klf4, c-Myc, or Esrrb.

In certain aspects of the differentiation programming 
vector, the replication origin may be a replication origin of 
a lymphotrophic herpes virus or a gamma herpesvirus, an 
adenovirus, SV40, a bovine papilloma virus, or a yeast, 
specifically a replication origin of a lymphotrophic herpes 
virus or a gamma herpesvirus corresponding to oriP of EBV. 
In a particular aspect, the lymphotrophic herpes virus may 
be Epstein Barr virus (EBV), Kaposi's sarcoma herpes virus 
(KSHV), Herpes virus saimiri (HS), or Marek's disease 
virus (MDV). Epstein Barr virus (EBV) and Kaposi's sar- 
coma herpes virus (KSHV) are also examples of a gamma 
herpesvirus. 

In further embodiments of the differentiation program- 
ming vector, the trans-acting factor may be a polypeptide 
corresponding to, or a derivative of, a wild-type protein
corresponding to EBNA-1 of EBV. The derivative may
activate transcription at least 5% that of the corresponding
wild-type protein from an extra-chromosomal template after 
the derivative binds the replication origin, and/or have a 
reduced ability to activate transcription from an integrated 
template as compared to wild-type EBNA-1 and thus 
reduced chances to ectopically activate chromosome genes 
to cause oncogenic transformation. Examples of a derivative
may include a derivative which lacks sequences present in 
the wild-type EBNA-1 protein that activate transcription 
from an integrated template, a derivative which has a nuclear 
localization sequence, a derivative which has a deletion of 
residues corresponding to residues about 65 to about 89 of 
EBNA-1, and/or has a deletion of residues corresponding to 
residues about 90 to about 328 of EBNA-1, a derivative with 
at least 80% amino acid sequence identity to residues 1 to 
about 40 and residues about 328 to 641 of EBNA-1, or a 
derivative comprising a first nucleotide sequence encoding 
residues 1 to about 40 of the corresponding wild-type 
EBNA-1 and a second nucleotide sequence encoding resi- 
dues about 328 to 641 of the corresponding wild-type 
EBNA-1. 

Embodiments discussed in the context of methods and/or 
compositions of the invention may be employed with respect 
to any other method or composition described herein. Thus, 
an embodiment pertaining to one method or composition 
may be applied to other methods and compositions of the 
invention as well. 

As used herein the terms “encode” or “encoding” with 
reference to a nucleic acid are used to make the invention 
readily understandable by the skilled artisan; however, these
terms may be used interchangeably with “comprise” or 
“comprising” respectively. 

As used herein in the specification, “a” or “an” may mean
one or more. As used herein in the claim(s), when used in
conjunction with the word “comprising”, the words “a” or
“an” may mean one or more than one.

The use of the term “or” in the claims is used to mean 
“and/or” unless explicitly indicated to refer to alternatives 
only or the alternatives are mutually exclusive, although the 
disclosure supports a definition that refers to only alterna- 
tives and “and/or.” As used herein “another” may mean at 
least a second or more. 

Throughout this application, the term “about” is used to 
indicate that a value includes the inherent variation of error 
for the device, the method being employed to determine the 
value, or the variation that exists among the study subjects. 

Other objects, features and advantages of the present 
invention will become apparent from the following detailed 
description. It should be understood, however, that the 
detailed description and the specific examples, while indi- 
cating preferred embodiments of the invention, are given by 
way of illustration only, since various changes and modifi- 
cations within the spirit and scope of the invention will 
become apparent to those skilled in the art from this detailed 
description. 


BRIEF DESCRIPTION OF THE DRAWINGS 


The following drawings form part of the present specifi- 
cation and are included to further demonstrate certain 
aspects of the present invention. The invention may be better
understood by reference to one or more of these drawings in 
combination with the detailed description of specific 
embodiments presented herein. 

FIG. 1: The EBV genome and the latent origin of plasmid 
replication (oriP). 

FIG. 2: A domain-based model and partial structure 
representation of EBNA1.

FIG. 3: An illustrative example of a recipient backbone 
plasmid used in the present invention. 

FIG. 4: Examples of cassettes to be integrated into a 
recipient backbone plasmid. Examples of promoters that 
could be used for expression include, but are not limited to, 
PGK, CMV, SV40, and EF1α.

FIG. 5: An illustrative example of a reprogramming 
plasmid encoding Sox-2, Oct-4, Nanog and Lin28 (option- 
ally). 


DESCRIPTION OF ILLUSTRATIVE 
EMBODIMENTS 


I. The Present Invention 


The instant invention overcomes several major problems 
with current reprogramming technologies or differentiation 
programming of various developmental stages of cells, such 
as generating induced pluripotent stem cells that are essen- 
tially free of viral vectors or exogenous elements. In contrast 
to previous methods using integrating viral vectors, these 
methods use extra-chromosomally replicating reprogramming or
differentiation programming vectors, for example, EBV element-based
plasmids, to transduce reprogramming or differentiation
programming factors into somatic cells or stem cells,
culture these cells and select progeny cells for one or more 
embryonic stem cell characteristics or cells for traits con- 
sistent with an desired altered differentiation status. The 
extra-chromosomally replicated vectors, like EBV element- 
based vectors, will not be integrated into the host cell 
genome and will be lost over time after a period sufficient to 
induce cells into a pluripotent or a desired cell state. An 
inherent feature of these methods will produce progeny cells 
essentially free of exogenous genetic elements and a nega- 
tive selection may facilitate the process. These methods 
enable isolation of iPS cells or any desired cell types 
essentially free of vector elements by altering differentiation 
status. Thus, the new compositions and methods will enable 
manufacture of vector-free iPS cells or other desired cell 
types for therapeutics without the risk of mutagenesis caused 
by random insertion or persistent expression of viral ele- 
ments in the cells. Further embodiments and advantages of 
the invention are described below. 


II. Definitions 


“Reprogramming” is a process that confers on a cell a
measurably increased capacity to form progeny of at least 
one new cell type, either in culture or in vivo, than it would 
have under the same conditions without reprogramming. 
More specifically, reprogramming is a process that confers 
on a somatic cell a pluripotent potential. This means that
after sufficient proliferation, a measurable proportion of
progeny have phenotypic characteristics of the new cell
type if essentially no such progeny could form before
reprogramming; otherwise, the proportion having character-
istics of the new cell type is measurably more than before
reprogramming. Under certain conditions, the proportion of
progeny with characteristics of the new cell type may be at
least about 1%, 5%, 25% or more, in order of
increasing preference.

“Differentiation programming” is a process that changes
a cell to form progeny of at least one new cell type with a
new differentiation status, either in culture or in vivo, than
it would have under the same conditions without differen-
tiation programming. This process includes differentia-
tion, dedifferentiation and transdifferentiation. “Differentia-
tion” is the process by which a less specialized cell becomes
a more specialized cell type. “Dedifferentiation” is a cellular 
process in which a partially or terminally differentiated cell 
reverts to an earlier developmental stage, such as pluripo- 
tency or multipotency. “Transdifferentiation” is a process of 
transforming one differentiated cell type into another differ- 
entiated cell type. 

An “origin of replication” (“ori”) or “replication origin” is
a DNA sequence, e.g., in a lymphotrophic herpes virus, that
when present in a plasmid in a cell is capable of maintaining
linked sequences in the plasmid, and/or a site at or near
where DNA synthesis initiates. An ori for EBV includes FR
sequences (20 imperfect copies of a 30 bp repeat), and
preferably DS sequences; however, other sites in EBV bind
EBNA-1, e.g., Rep* sequences can substitute for DS as an
origin of replication (Kirchmaier and Sugden, 1998). Thus,
a replication origin of EBV includes FR, DS or Rep*
sequences or any functionally equivalent sequences through
nucleic acid modifications or synthetic combination derived
therefrom. For example, the present invention may also use
a genetically engineered replication origin of EBV, such as by
insertion or mutation of individual elements, as specifically
described in Lindner et al., 2008.

A “lymphotrophic” herpes virus is a herpes virus that 
replicates in a lymphoblast (e.g., a human B lymphoblast) or 
other cell types and replicates extra-chromosomally for at 
least a part of its natural life-cycle. After infecting a host, 
these viruses latently infect the host by maintaining the viral 
genome as a plasmid. Herpes simplex virus (HSV) is not a 
“lymphotrophic” herpes virus. Exemplary lymphotropic her-
pes viruses include, but are not limited to, EBV, Kaposi's
sarcoma herpes virus (KSHV), Herpes virus saimiri (HS) 
and Marek's disease virus (MDV). 

A “vector” or “construct” (sometimes referred to as gene 
delivery or gene transfer “vehicle”) refers to a macromol- 
ecule or complex of molecules comprising a polynucleotide 
to be delivered to a host cell, either in vitro or in vivo. 

A "plasmid", a common type of a vector, is an extra- 
chromosomal DNA molecule separate from the chromo- 
somal DNA which is capable of replicating independently of 
the chromosomal DNA. In certain cases, it is circular and 
double-stranded. 

A "template" as used herein is a DNA molecule which is 
specifically bound by a wild-type protein of a lymphotrophic 
herpes virus, which wild-type protein corresponds to 
EBNA-1, as a result of the presence in that template of a
DNA sequence which is bound by the wild-type protein with
an affinity that is at least 10% that of the binding of a DNA
sequence corresponding to oriP of EBV by the wild-type
protein and from which template transcription is optionally
initiated and/or enhanced after the protein binds and/or the
maintenance of which template in a cell is enhanced. An
“integrated template” is one which is stably maintained in
the genome of the cell, e.g., integrated into a chromosome of
that cell. An “extra-chromosomal template” is one which is
stably maintained in a cell but which is not
integrated into the chromosome.

By “expression construct” or “expression cassette” is
meant a nucleic acid molecule that is capable of directing
transcription. An expression construct includes, at the least,
a promoter or a structure functionally equivalent to a pro- 
moter. Additional elements, such as an enhancer, and/or a 
transcription termination signal, may also be included. 

The term “exogenous,” when used in relation to a protein, 
gene, nucleic acid, or polynucleotide in a cell or organism 
refers to a protein, gene, nucleic acid, or polynucleotide 
which has been introduced into the cell or organism by 
artificial or natural means, or in relation to a cell refers to a cell
which was isolated and subsequently introduced to other 
cells or to an organism by artificial or natural means. An 
exogenous nucleic acid may be from a different organism or 
cell, or it may be one or more additional copies of a nucleic 
acid which occurs naturally within the organism or cell. An 
exogenous cell may be from a different organism, or it may 
be from the same organism. By way of a non-limiting 
example, an exogenous nucleic acid is in a chromosomal 
location different from that of natural cells, or is otherwise 
flanked by a different nucleic acid sequence than that found 
in nature. 

The term “corresponds to” is used herein to mean that a 
polynucleotide sequence is homologous (i.e., is identical, 
not strictly evolutionarily related) to all or a portion of a 
reference polynucleotide sequence, or that a polypeptide 
sequence is identical to a reference polypeptide sequence. In 
contradistinction, the term "complementary to" is used 
herein to mean that the complementary sequence is homolo- 
gous to all or a portion of a reference polynucleotide 
sequence. For illustration, the nucleotide sequence 
“TATAC” corresponds to a reference sequence “TATAC” 
and is complementary to a reference sequence “GTATA”. 
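
The distinction drawn above between “corresponds to” and “complementary to” may be illustrated computationally. The following is a minimal sketch only, assuming DNA sequences written 5' to 3' over the letters A, C, G and T, and assuming that “complementary” is read as the reverse complement; the function names are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only: the function names below are hypothetical.
# Assumption: sequences are DNA, written 5' to 3', using only A, C, G and T.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def corresponds_to(seq: str, reference: str) -> bool:
    """True if seq is identical to all or a portion of the reference."""
    return seq.upper() in reference.upper()


def complementary_to(seq: str, reference: str) -> bool:
    """True if the reverse complement of seq is identical to all or a
    portion of the reference."""
    return reverse_complement(seq) in reference.upper()


if __name__ == "__main__":
    # The illustration from the text: "TATAC" versus "TATAC" and "GTATA".
    print(corresponds_to("TATAC", "TATAC"))
    print(complementary_to("TATAC", "GTATA"))
```

Under these assumptions, the sequence “TATAC” satisfies the first test against the reference “TATAC” and the second test against the reference “GTATA”, in agreement with the illustration given in the text.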

A “gene,” “polynucleotide,” “coding region,” “sequence,” 
“segment,” “fragment,” or “transgene” which “encodes” a 
particular protein, is a nucleic acid molecule which is 
transcribed and optionally also translated into a gene prod- 
uct, e.g., a polypeptide, in vitro or in vivo when placed under 
the control of appropriate regulatory sequences. The coding 
region may be present in either a cDNA, genomic DNA, or 
RNA form. When present in a DNA form, the nucleic acid 
molecule may be single-stranded (i.e., the sense strand) or 
double-stranded. The boundaries of a coding region are 
determined by a start codon at the 5' (amino) terminus and 
a translation stop codon at the 3' (carboxy) terminus. A gene 
can include, but is not limited to, cDNA from prokaryotic or 
eukaryotic mRNA, genomic DNA sequences from prokary- 
otic or eukaryotic DNA, and synthetic DNA sequences. A 
transcription termination sequence will usually be located 3' 
to the gene sequence. 

The term “control elements” refers collectively to pro- 
moter regions, polyadenylation signals, transcription termi- 
nation sequences, upstream regulatory domains, origins of 
replication, internal ribosome entry sites (“IRES”), enhanc- 
ers, splice junctions, and the like, which collectively provide 
for the replication, transcription, post-transcriptional pro- 
cessing and translation of a coding sequence in a recipient 
cell. Not all of these control elements need always be present 
so long as the selected coding sequence is capable of being 
replicated, transcribed and translated in an appropriate host 
cell. 

The term “promoter” is used herein in its ordinary sense 
to refer to a nucleotide region comprising a DNA regulatory 
sequence, wherein the regulatory sequence is derived from 
a gene which is capable of binding RNA polymerase and 
initiating transcription of a downstream (3' direction) coding 
sequence.

By “enhancer” is meant a nucleic acid sequence that,
when positioned proximate to a promoter, confers increased 
transcription activity relative to the transcription activity 
resulting from the promoter in the absence of the enhancer 
domain. 

By “operably linked” with reference to nucleic acid
molecules is meant that two or more nucleic acid molecules
(e.g., a nucleic acid molecule to be transcribed, a promoter,
and an enhancer element) are connected in such a way as to
permit transcription of the nucleic acid molecule. By “operably
linked” with reference to peptide and/or polypeptide mol-
ecules is meant that two or more peptide and/or polypeptide
molecules are connected in such a way as to yield a single 
polypeptide chain, i.e., a fusion polypeptide, having at least 
one property of each peptide and/or polypeptide component 
of the fusion. The fusion polypeptide is preferably chimeric, 
i.e., composed of heterologous molecules. 

“Homology” refers to the percent of identity between two
polynucleotides or two polypeptides. The correspondence
between one sequence and another can be determined by
techniques known in the art. For example, homology can be
determined by a direct comparison of the sequence infor-
mation between two polypeptide molecules by aligning the
sequence information and using readily available computer
programs. Alternatively, homology can be determined by
hybridization of polynucleotides under conditions which
form stable duplexes between homologous regions, fol-
lowed by digestion with single strand-specific nuclease(s),
and size determination of the digested fragments. Two DNA,
or two polypeptide, sequences are “substantially homolo-
gous” to each other when at least about 80%, preferably at
least about 90%, and most preferably at least about 95% of
the nucleotides, or amino acids, respectively match over a 
defined length of the molecules, as determined using the 
methods above. 
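
By way of illustration only, the percent identity underlying the homology thresholds above may be computed as sketched below. This sketch assumes that the two sequences have already been aligned over a defined length and are of equal length without gaps, and it treats the “at least about” thresholds as exact cutoffs; the function names are hypothetical, and a practical comparison would ordinarily rely on an established alignment program.

```python
# Illustrative sketch only: function names are hypothetical.
# Assumptions: both sequences are pre-aligned, gap-free and of equal length,
# and "at least about 80%" is treated as an exact cutoff of 80.0.


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match over the defined (aligned) length."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def is_substantially_homologous(seq_a: str, seq_b: str,
                                threshold: float = 80.0) -> bool:
    """Apply a threshold (80, 90 or 95 in the text) to the percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    a = "ACGTACGTAC"
    b = "ACGTACGTAA"  # differs from a at the final position only
    print(percent_identity(a, b))
    print(is_substantially_homologous(a, b, threshold=95.0))
```

The threshold is left as a parameter so that the same comparison can be applied at the 80%, 90% or 95% level recited above.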

The term “cell” is herein used in its broadest sense in the 
art and refers to a living body which is a structural unit of 
tissue of a multicellular organism, is surrounded by a 
membrane structure which isolates it from the outside, has 
the capability of self replicating, and has genetic information 
and a mechanism for expressing it. Cells used herein may be 
naturally-occurring cells or artificially modified cells (e.g., 
fusion cells, genetically modified cells, etc.). 

As used herein, the term “stem cell” refers to a cell
capable of self replication and pluripotency. Typically, stem 
cells can regenerate an injured tissue. Stem cells herein may 
be, but are not limited to, embryonic stem (ES) cells or tissue 
stem cells (also called tissue-specific stem cell, or somatic 
stem cell). Any artificially produced cell which can have the 
above-described abilities (e.g., fusion cells, reprogrammed 
cells, or the like used herein) may be a stem cell. 

“Embryonic stem (ES) cells” are pluripotent stem cells 
derived from early embryos. ES cells were first established
in 1981 and have been applied to the production of
knockout mice since 1989. Human ES cells were
established in 1998 and are becoming available for
regenerative medicine.

Unlike ES cells, tissue stem cells have a limited differ- 
entiation potential. Tissue stem cells are present at particular 
locations in tissues and have an undifferentiated intracellular 
structure. Therefore, the pluripotency of tissue stem cells is 
typically low. Tissue stem cells have a higher nucleus/ 
cytoplasm ratio and have few intracellular organelles. Most 
tissue stem cells have low pluripotency, a long cell cycle, 
and proliferative ability beyond the life of the individual. 
Tissue stem cells are separated into categories, based on the 
sites from which the cells are derived, such as the dermal 
system, the digestive system, the bone marrow system, the
nervous system, and the like. Tissue stem cells in the dermal 
system include epidermal stem cells, hair follicle stem cells, 
and the like. Tissue stem cells in the digestive system 
include pancreatic (common) stem cells, liver stem cells, 
and the like. Tissue stem cells in the bone marrow system 
include hematopoietic stem cells, mesenchymal stem cells, 
and the like. Tissue stem cells in the nervous system include 
neural stem cells, retinal stem cells, and the like. 

“Induced pluripotent stem cells,” commonly abbreviated
as iPS cells or iPSCs, refer to a type of pluripotent stem cell
artificially prepared from a non-pluripotent cell, typically an
adult somatic cell, or terminally differentiated cell, such as
a fibroblast, a hematopoietic cell, a myocyte, a neuron, an
epidermal cell, or the like, by inserting certain genes, 
referred to as reprogramming factors. 

“Pluripotency” refers to a stem cell that has the potential 
to differentiate into all cells constituting one or more tissues 
or organs, or preferably, any of the three germ layers: 
endoderm (interior stomach lining, gastrointestinal tract, the 
lungs), mesoderm (muscle, bone, blood, urogenital), or 
ectoderm (epidermal tissues and nervous system). “Pluripo-
tent stem cells” used herein refer to cells that can differen-
tiate into cells derived from any of the three germ layers, for
example, direct descendants of totipotent cells or induced 
pluripotent cells. 

As used herein, “totipotent stem cells” refers to cells that have
the ability to differentiate into all cells constituting an
organism, such as cells that are produced from the fusion of 
an egg and sperm cell. Cells produced by the first few 
divisions of the fertilized egg are also totipotent. These cells 
can differentiate into embryonic and extraembryonic cell 
types. Pluripotent stem cells can give rise to any fetal or 
adult cell type. However, alone they cannot develop into a 
fetal or adult animal because they lack the potential to 
contribute to extraembryonic tissue, such as the placenta. 

In contrast, many progenitor cells are multipotent, i.e., 
they are capable of differentiating into a limited number of 
cell fates. Multipotent progenitor cells can give rise to 
several other cell types, but those types are limited in 
number. An example of a multipotent stem cell is a 
hematopoietic cell—a blood stem cell that can develop into 
several types of blood cells, but cannot develop into brain 
cells or other types of cells. At the end of the long series of 
cell divisions that form the embryo are cells that are termi- 
nally differentiated, or that are considered to be permanently 
committed to a specific function. 

“Self-renewal” refers to the ability to go through numer- 
ous cycles of cell division while maintaining the undiffer- 
entiated state. 

As used herein, the term “somatic cell” refers to any cell
other than germ cells, such as an egg, a sperm, or the like, 
which does not directly transfer its DNA to the next gen- 
eration. Typically, somatic cells have limited or no pluripo- 
tency. Somatic cells used herein may be naturally-occurring 
or genetically modified. 

Cells are “substantially free” of exogenous genetic ele-
ments, as used herein, when they have less than 10% of the
element(s), and are “essentially free” of exogenous genetic
elements when they have less than 1% of the element(s).
However, even more desirable are cell populations wherein
less than 0.5% or less than 0.1% of the total cell population
comprise exogenous genetic elements. Thus, iPS cell popu-
lations are contemplated wherein less than 0.1% to 10%
(including all intermediate percentages) of the cells of the
population comprise undesirable exogenous genetic elements.
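
The thresholds in the preceding definition may be expressed as a simple classification, sketched below for illustration only. The sketch assumes that the percentage is measured as the fraction of cells in a population found to carry the exogenous element(s), which is one possible reading of the definition; the function name and the returned labels are hypothetical.

```python
# Illustrative sketch only: the function name and labels are hypothetical.
# Assumption: the percentage is the fraction of cells in the population
# that carry the exogenous genetic element(s).


def classify_population(cells_with_element: int, total_cells: int) -> str:
    """Classify a cell population against the 1% and 10% thresholds."""
    if total_cells <= 0 or not 0 <= cells_with_element <= total_cells:
        raise ValueError("counts must satisfy 0 <= cells_with_element <= total_cells")
    # Integer comparisons avoid floating-point rounding at the cutoffs.
    if cells_with_element * 100 < total_cells:   # less than 1%
        return "essentially free"
    if cells_with_element * 10 < total_cells:    # less than 10%
        return "substantially free"
    return "not substantially free"


if __name__ == "__main__":
    print(classify_population(5, 1000))
    print(classify_population(50, 1000))
    print(classify_population(100, 1000))
```

Because both definitions use “less than”, a population at exactly 1% falls in the “substantially free” category and a population at exactly 10% falls outside both categories in this sketch.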


III. General Background for Induced Pluripotent 
Stem Cells 


In certain embodiments of the invention, there are dis- 
closed methods of reprogramming somatic cells by intro- 
ducing reprogramming factors into somatic cells with an 
extra-chromosomal vector-based system. The progeny of 
these cells could be identical to embryonic stem cells in 
various aspects as described below, but essentially free of 
exogenous genetic elements. Understanding of embryonic 
stem cell characteristics could help select induced pluripo- 
tent stem cells. Reprogramming factors known from stem 
cell reprogramming studies could be used for these novel 
methods. It is further contemplated that these induced pluri- 
potent stem cells could potentially be used to replace embry-
onic stem cells for therapeutic and research applications
because of the ethical hurdles to using the latter.

A. Stem Cells 

Stem cells are cells found in most, if not all, multi-cellular 
organisms. They are characterized by the ability to renew 
themselves through mitotic cell division and to differentiate
into a diverse range of specialized cell types. The two broad
types of mammalian stem cells are: embryonic stem cells 
that are found in blastocysts, and adult stem cells that are 
found in adult tissues. In a developing embryo, stem cells 
can differentiate into all of the specialized embryonic tis- 
sues. In adult organisms, stem cells and progenitor cells act 
as a repair system for the body, replenishing specialized 
cells, but also maintain the normal turnover of regenerative 
organs, such as blood, skin or intestinal tissues. 

As stem cells can be grown and transformed into special- 
ized cells with characteristics consistent with cells of various 
tissues such as muscles or nerves through cell culture, their 
use in medical therapies has been proposed. In particular, 
embryonic cell lines, autologous embryonic stem cells gen- 
erated through therapeutic cloning, and highly plastic adult 
stem cells from the umbilical cord blood or bone marrow are 
touted as promising candidates. Most recently, the repro- 
gramming of adult cells into induced pluripotent stem cells 
has enormous potential for replacing embryonic stem cells. 

B. Embryonic Stem Cells 

Embryonic stem cell lines (ES cell lines) are cultures of 
cells derived from the epiblast tissue of the inner cell mass 
(ICM) of a blastocyst or earlier morula stage embryos. A 
blastocyst is an early stage embryo—approximately four to 
five days old in humans and consisting of 50-150 cells. ES 
cells are pluripotent and give rise during development to all 
derivatives of the three primary germ layers: ectoderm, 
endoderm and mesoderm. In other words, they can develop 
into each of the more than 200 cell types of the adult body 
when given sufficient and necessary stimulation for a spe- 
cific cell type. They do not contribute to the extra-embryonic 
membranes or the placenta. 

Nearly all research to date has taken place using mouse 
embryonic stem cells (mES) or human embryonic stem cells 
(hES). Both have the essential stem cell characteristics, yet 
they require very different environments in order to maintain 
an undifferentiated state. Mouse ES cells may be grown on 
a layer of gelatin and require the presence of Leukemia 
Inhibitory Factor (LIF). Human ES cells could be grown on 
a feeder layer of mouse embryonic fibroblasts (MEFs) and 
often require the presence of basic Fibroblast Growth Factor 
(bFGF or FGF-2). Without optimal culture conditions or 
genetic manipulation (Chambers et al., 2003), embryonic
stem cells will rapidly differentiate. 

A human embryonic stem cell may also be defined by the
presence of several transcription factors and cell surface 
proteins. The transcription factors Oct-4, Nanog, and Sox-2 
form the core regulatory network that ensures the suppres- 
sion of genes that lead to differentiation and the maintenance 
of pluripotency (Boyer et al., 2005). The cell surface anti- 
gens most commonly used to identify hES cells include the 
glycolipids SSEA3 and SSEA4 and the keratan sulfate 
antigens Tra-1-60 and Tra-1-81. 

After twenty years of research, there are no approved 
treatments or human trials using embryonic stem cells. ES 
cells, being pluripotent cells, require specific signals for 
correct differentiation—if injected directly into the body, ES 
cells will differentiate into many different types of cells, 
causing a teratoma. Differentiating ES cells into usable cells 
while avoiding transplant rejection are just a few of the 
hurdles that embryonic stem cell researchers still face. Many 
nations currently have moratoria on either ES cell research 
or the production of new ES cell lines. Because of their 
combined abilities of unlimited expansion and pluripotency, 
embryonic stem cells remain a theoretically potential source 
for regenerative medicine and tissue replacement after injury 
or disease. However, one way to circumvent these issues is 
to induce pluripotent status in somatic cells by direct repro- 
gramming. 

C. Reprogramming Factors 

The generation of iPS cells depends critically on the genes used for
the induction. The following factors or combination thereof 
could be used in the vector system disclosed in the present 
invention. In certain aspects, nucleic acids encoding Sox and 
Oct (preferably Oct3/4) will be included into the reprogram- 
ming vector. For example, a reprogramming vector may 
comprise expression cassettes encoding Sox-2, Oct-4, 
Nanog and optionally Lin-28, or expression cassettes encod-
ing Sox-2, Oct-4, Klf4 and optionally c-myc, or expression
cassettes encoding Sox-2, Oct-4, and optionally Esrrb. 
Nucleic acids encoding these reprogramming factors may be 
comprised in the same expression cassette, different expres- 
sion cassettes, the same reprogramming vector, or different 
reprogramming vectors. 
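
For illustration only, the exemplary factor combinations recited above may be represented as a simple data structure, as sketched below. The set names, the dictionary layout and the function names are hypothetical conveniences and do not describe any actual vector map; the sketch merely records which factors are recited as required or optional in each example and checks that a Sox and an Oct factor are present.

```python
# Illustrative sketch only: set names, layout and function names are hypothetical.
# Each entry mirrors one exemplary combination recited in the text.

EXAMPLE_FACTOR_SETS = {
    "set_1": {"required": ("Sox-2", "Oct-4", "Nanog"), "optional": ("Lin-28",)},
    "set_2": {"required": ("Sox-2", "Oct-4", "Klf4"), "optional": ("c-myc",)},
    "set_3": {"required": ("Sox-2", "Oct-4"), "optional": ("Esrrb",)},
}


def factors_for(set_name: str, include_optional: bool = False) -> list:
    """List the factors of one exemplary set, optionally with its optional factor."""
    entry = EXAMPLE_FACTOR_SETS[set_name]
    factors = list(entry["required"])
    if include_optional:
        factors.extend(entry["optional"])
    return factors


def includes_sox_and_oct(factors) -> bool:
    """Check that at least one Sox factor and one Oct factor are present."""
    has_sox = any(name.startswith("Sox") for name in factors)
    has_oct = any(name.startswith("Oct") for name in factors)
    return has_sox and has_oct


if __name__ == "__main__":
    for name in EXAMPLE_FACTOR_SETS:
        chosen = factors_for(name, include_optional=True)
        print(name, chosen, includes_sox_and_oct(chosen))
```

Whether the listed factors are carried in one expression cassette, several cassettes, or several vectors is deliberately not modeled here, since the text leaves each of those arrangements open.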

Oct-3/4 and certain members of the Sox gene family 
(Sox-1, Sox-2, Sox-3, and Sox-15) have been identified as 
crucial transcriptional regulators involved in the induction 
process whose absence makes induction impossible. Addi-
tional genes, however, including certain members of the Klf
family (Klf1, Klf2, Klf4, and Klf5), the Myc family
(c-myc, L-myc, and N-myc), Nanog, and LIN28, have been
identified to increase the induction efficiency. 

Oct-3/4 (Pou5f1) is one of the family of octamer (“Oct”)
transcription factors, and plays a crucial role in maintaining 
pluripotency. The absence of Oct-3/4 in Oct-3/4+ cells, such 
as blastomeres and embryonic stem cells, leads to sponta- 
neous trophoblast differentiation, and presence of Oct-3/4 
thus gives rise to the pluripotency and differentiation poten- 
tial of embryonic stem cells. Various other genes in the 
“Oct” family, including Oct-3/4's close relatives, Oct1 and
Oct6, fail to elicit induction, thus demonstrating the exclu- 
siveness of Oct-3/4 to the induction process. 

The Sox family of genes is associated with maintaining 
pluripotency similar to Oct-3/4, although it is associated 
with multipotent and unipotent stem cells in contrast with 
Oct-3/4, which is exclusively expressed in pluripotent stem 
cells. While Sox-2 was the initial gene used for induction by 
Yamanaka et al., Jaenisch et al., and Thomson et al., other
genes in the Sox family have been found to work as well in
the induction process. Sox1 yields iPS cells with a similar
efficiency as Sox-2, and genes Sox3, Sox15, and Sox18 also 
generate iPS cells, although with decreased efficiency. 

In embryonic stem cells, Nanog, along with Oct-3/4 and 
Sox-2, is necessary in promoting pluripotency. Therefore, it 
was surprising when Yamanaka et al. reported that Nanog 
was unnecessary for induction although Thomson et al. have
reported it is possible to generate iPS cells with Nanog as 
one of the factors. 

LIN28 is an mRNA binding protein expressed in embry- 
onic stem cells and embryonic carcinoma cells associated 
with differentiation and proliferation. Thomson et al. dem- 
onstrated it is a factor in iPS generation, although it is 
unnecessary. 

Klf4 of the Klf family of genes was initially identified by
Yamanaka et al. and confirmed by Jaenisch et al. as a factor
for the generation of mouse iPS cells and was demonstrated
by Yamanaka et al. as a factor for generation of human iPS
cells. However, Thomson et al. reported that Klf4 was
unnecessary for generation of human iPS cells and in fact
failed to generate human iPS cells. Klf2 and Klf4 were
found to be factors capable of generating iPS cells, and
related genes Klf1 and Klf5 did as well, although with
reduced efficiency. 

The Myc family of genes are proto-oncogenes implicated 
in cancer. Yamanaka et al. and Jaenisch et al. demonstrated 
that c-myc is a factor implicated in the generation of mouse 
iPS cells and Yamanaka et al. demonstrated it was a factor 
implicated in the generation of human iPS cells. However, 
Thomson et al. and Yamanaka et al. reported that c-myc was 
unnecessary for generation of human iPS cells. Usage of the 
“myc” family of genes in induction of iPS cells is troubling
for the eventuality of iPS cells as clinical therapies, as 25%
of mice transplanted with c-myc-induced iPS cells devel- 
oped lethal teratomas. N-myc and L-myc have been identi- 
fied to induce in the stead of c-myc with similar efficiency. 

D. Induction of Pluripotent Stem Cells Using Integrating 
Vectors 

IPS cells are typically derived by transfection of certain 
stem cell-associated genes into non-pluripotent cells, such as 
adult fibroblasts. Transfection is typically achieved through 
integrating viral vectors in the current practice, such as 
retroviruses. Transfected genes include the master transcrip-
tional regulators Oct-3/4 (Pou5f1) and Sox-2, although it is
suggested that other genes enhance the efficiency of induc- 
tion. After a critical period, small numbers of transfected 
cells begin to become morphologically and biochemically 
similar to pluripotent stem cells, and are typically isolated 
through morphological selection, doubling time, or through 
a reporter gene and antibiotic selection.

In November 2007, a milestone was achieved by creating 
iPS from adult human cells from two independent research 
teams' studies (Yu et al., 2007; Yamanaka et al., 2007). With 
the same principle used earlier in mouse models, Yamanaka 
had successfully transformed human fibroblasts into pluri- 
potent stem cells using the same four pivotal genes: Oct3/4, 
Sox-2, Klf4, and c-Myc with a retroviral system, although c-Myc
is oncogenic. Thomson and colleagues used Oct-4, Sox-2,
NANOG, and a different gene LIN28 using a lentiviral 
system avoiding the use of c-Myc. 

However, the viral transfection systems used insert the 
genes at random locations in the host's genome; this is a 
concern for potential therapeutic applications of these 
iPSCs, because the created cells might be susceptible to 
cancer. Members of both teams consider it therefore neces-
sary to develop new delivery methods.

On the other hand, forced persistent expression of ectopic
reprogramming factors may be linked to an elevated fre- 
quency of tumor formation and the final solution to this 
problem will be the generation of transgene-free iPS cells. A 
suite of virally introduced genes may be necessary to start 
the reprogramming process, but gradually the cell’s own 
endogenous pluripotency genes become active, and the viral 
genes will be silenced with a potential to be stochastically reacti-
vated. Recently, researchers demonstrated that exogenous factors
may be required for a minimum of about 10-16 days in order
for cells to enter a self-sustaining pluripotent state (Bram-
brink et al., 2008; Stadtfeld et al., 2008). The determination
of the minimum length of transgene expression permits the
development of non-retroviral delivery methods to derive
iPS cells, an advantage achieved by the presently disclosed
methods and iPS cells as described below. 


IV. Extra-Chromosomal Vectors for Generating 
Vector-Free Induced Pluripotent Stem Cells and 
Other Cell Types 


As described above, induction of pluripotent stem cells 
from human somatic cells has been achieved using retrovi- 
ruses or lentiviral vectors for ectopic expression of repro- 
gramming genes. Recombinant retroviruses such as the 
Moloney murine leukemia virus have the ability to integrate 
into the host genome in a stable fashion. They contain a 
reverse transcriptase that converts the viral RNA into DNA for integration into the host
genome. Lentiviruses are a subclass of Retroviruses. They 
are widely adapted as vectors thanks to their ability to 
integrate into the genome of non-dividing as well as dividing 
cells. These viral vectors also have been widely used in a 
broader context: differentiation programming of cells, 
including dedifferentiation, differentiation, and transdiffer- 
entiation. The viral genome in the form of RNA is reverse- 
transcribed when the virus enters the cell to produce DNA, 
which is then inserted into the genome at a random position 
by the viral integrase enzyme. Therefore, current technology 
of successful reprogramming is dependent on integration- 
based viral approaches. 

However, with the present technology, targeted integra-
tion is still not routine (Bode et al., 2000b) and the conven-
tional alternative, random integration, may lead to inser-
tional mutagenesis with unpredictable consequences in
induced pluripotent stem cells. For the same reasons, expres-
sion of the transgene cannot be controlled since it is
dependent on the chromatin context of the integration site
(Baer et al., 2000). High level expression can only be
achieved at favorable genomic loci but the danger exists that 
integration into highly expressed sites interferes with vital 
cellular functions of induced pluripotent stem cells. 

In addition, there is increasing evidence for the existence 
of cellular defense mechanisms against foreign DNA which 
operate by down-regulating transgenes in a process that is 
accompanied by DNA methylation (Bingham, 1997; Garrick
et al., 1998). Furthermore, viral components may act along 
with other factors to transform cells. Accompanied by the 
continual expression from a number of viral genes, the 
persistence of at least part of the viral genome within the cell
may cause cell transformation. These genes may interfere 
with a cell's signaling pathway causing the observed phe- 
notypic changes of the cell, leading to a transformed cell 
showing increased cell division, which is favorable to the 
virus. 

Therefore, in certain embodiments, the present invention 
develops methods to generate induced pluripotent stem cells 
and other desired cell types essentially free of exogenous
genetic elements, such as from retroviral or lentiviral vectors
used in the previous methods. These methods make use of
extra-chromosomally replicating vectors, or vectors capable 
of replicating episomally. A number of DNA viruses, such as 
adenoviruses, Simian vacuolating virus 40 (SV40) or bovine 
papilloma virus (BPV), or budding yeast ARS (Autono- 
mously Replicating Sequences)-containing plasmids repli- 
cate extra-chromosomally or episomally in mammalian 
cells. These episomal plasmids are intrinsically free from all
these disadvantages (Bode et al., 2001) associated with
integrating vectors but have never been publicly disclosed
for generating induced pluripotent stem cells. A lymphotrophic
herpes virus-based vector, including an Epstein-Barr Virus
(EBV)-based vector as defined above, may also replicate
extra-chromosomally and help deliver reprogramming genes to
somatic cells. Although the replication origins of these
viruses or ARS elements are well characterized, they have never
been publicly known for reprogramming differentiated cells
until this disclosure.

For example, the plasmid-based approach used in the 
invention extracts robust elements necessary for the suc- 
cessful replication and maintenance of an EBV element- 
based system without compromising the system's tractabil- 
ity in a clinical setting as described in detail below. The 
essential EBV elements are OriP and EBNA-1 or their 
variants or functional equivalents. An additional advantage 
of this system is that these exogenous elements will be lost 
with time after being introduced into cells, leading to 
self-sustained iPS cells essentially free of these elements. 

A. Epstein-Barr Virus 

The Epstein-Barr Virus (EBV), also called Human her- 
pesvirus 4 (HHV-4), is a virus of the herpes family (which 
includes Herpes simplex virus and Cytomegalovirus), and is 
one of the most common viruses in humans. EBV maintains 
its genome extra-chromosomally and works in collaboration 
with host cell machinery for efficient replication and main- 
tenance (Lindner and Sugden, 2007), relying solely on two 
essential features for its replication and its retention within 
cells during cell division (Yates et al. 1985; Yates et al. 
1984). One element, commonly referred to as oriP, exists in 
cis and serves as the origin of replication. The other factor, 
EBNA1, functions in trans by binding to sequences within 
oriP to promote replication and maintenance of the plasmid 
DNA. As a non-limiting example, the inventors extract these
two features and use them in the context of a plasmid to
shuttle the genes necessary for reprogramming somatic cells,
facilitating the replication and sustained expression of these
genes relative to conventional plasmids.

B. OriP 

OriP is the site at or near which DNA replication initiates 
and is composed of two cis-acting sequences approximately 
1 kilobase pair apart known as the family of repeats (FR) and 
the dyad symmetry (DS). 

FR is composed of 21 imperfect copies of a 30 bp repeat 
and contains 20 high affinity EBNA1-binding sites (FIG. 1). 
When FR is bound by EBNA1, it both serves as a transcrip- 
tional enhancer of promoters in cis up to 10 kb away 
(Reisman and Sugden, 1986; Yates, 1988; Sugden and 
Warren, 1989; Wysokenski and Yates, 1989; Gahn and 
Sugden, 1995; Kennedy and Sugden, 2003; Altmann et al., 
2006), and contributes to the nuclear retention and faithful 
maintenance of FR containing plasmids (Langle-Rouault et 
al., 1998; Kirchmaier and Sugden, 1995; Wang et al., 2006; 
Nanbo and Sugden, 2007). The efficient partitioning of oriP 
plasmids is also likely attributable to FR. While the virus has 
evolved to maintain 20 EBNA1-binding sites in FR, efficient 
plasmid maintenance requires only seven of these sites, and
can be reconstituted by a polymer of three copies of DS,
having a total of 12 EBNA1-binding sites (Wysokenski and 
Yates, 1989). 

The dyad symmetry element (DS) is sufficient for initia- 
tion of DNA synthesis in the presence of EBNA1 (Aiyar et 
al., 1998; Yates et al., 2000), and initiation occurs either at 
or near DS (Gahn and Schildkraut, 1989; Niller et al., 1995). 
Termination of viral DNA synthesis is thought to occur at 
FR, because when FR is bound by EBNA1 it functions as a
replication fork barrier as observed by 2D gel electropho- 
resis (Gahn and Schildkraut, 1989; Ermakova et al., 1996; 
Wang et al., 2006). Initiation of DNA synthesis from DS is 
licensed to once-per-cell-cycle (Adams, 1987; Yates and 
Guan, 1991), and is regulated by the components of the 
cellular replication system (Chaudhuri et al., 2001; Ritzi et 
al., 2003; Dhar et al., 2001; Schepers et al., 2001; Zhou et 
al., 2005; Julien et al., 2004). DS contains four
EBNA1-binding sites, albeit with lower affinity than those found in
FR (Reisman et al., 1985). The topology of DS is such that 
the four binding sites are arranged as two pairs of sites, with 
21 bp center-to-center spacing between each pair and 33 bp 
center-to-center spacing between the two non-paired internal 
binding sites (FIG. 1c) (Baer et al., 1984; Rawlins et al., 
1985). 
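The spacing described above can be turned into relative coordinates with a little arithmetic. The following is a minimal sketch, assuming that "between each pair" means the 21 bp center-to-center distance within each pair of sites, and placing the first site center at an arbitrary position 0. The function name `ds_site_centers` is hypothetical and is introduced only for illustration.

```python
# Illustrative coordinate model of the four EBNA1-binding sites in DS.
# The two spacings come from the text; everything else is an assumption.
PAIR_SPACING_BP = 21      # center-to-center distance within each pair
INTERNAL_SPACING_BP = 33  # center-to-center distance between the two inner sites


def ds_site_centers(first_center=0):
    """Return the centers of sites 1-4, relative to the first site center."""
    site1 = first_center
    site2 = site1 + PAIR_SPACING_BP       # second member of the first pair
    site3 = site2 + INTERNAL_SPACING_BP   # first member of the second pair
    site4 = site3 + PAIR_SPACING_BP       # second member of the second pair
    return [site1, site2, site3, site4]


centers = ds_site_centers()
# Distance from the first site center to the last site center.
outer_span_bp = centers[-1] - centers[0]
```

Under these assumptions the outer site centers lie 21 + 33 + 21 bp apart; the actual footprint of DS is larger because each site extends beyond its center.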

The functional roles of the elements within DS have been 
confirmed by studies of another region of EBV's genome, 
termed Rep*, which was identified as an element that can 
substitute for DS inefficiently (Kirchmaier and Sugden, 
1998). Polymerizing Rep* eight times yielded an element as 
efficient as DS in its support of replication (Wang et al., 
2006). Biochemical dissection of Rep* identified a pair of 
EBNA1-binding sites with a 21 bp center-to-center spacing 
critical for its replicative function (ibid). The minimal rep- 
licator of Rep* was found to be the pair of EBNA1-binding 
sites, as replicative function was retained even after all 
flanking sequences in the polymer were replaced with 
sequences derived from lambda phage. Comparisons of DS 
and Rep* have revealed a common mechanism: these rep- 
licators support the initiation of DNA synthesis by recruiting 
the cellular replicative machinery via a pair of appropriately 
spaced sites, bent and bound by EBNA1. 

There are other extra-chromosomal, licensed plasmids 
that replicate in mammalian cells that are unrelated to EBV 
and in some ways appear similar to the zone of initiation 
within the Raji strain of EBV. Hans Lipps and his colleagues 
have developed and studied plasmids that contain “nuclear 
scaffold/matrix attachment regions” (S/MARs) and a robust
transcriptional unit (Piechaczek et al., 1999; Jenke et al., 
2004). Their S/MAR is derived from the human interferon- 
beta gene, is A/T rich, and operationally defined by its 
association with the nuclear matrix and its preferential 
unwinding at low ionic strength or when embedded in 
supercoiled DNA (Bode et al., 1992). These plasmids rep- 
licate semiconservatively, bind ORC proteins, and support 
the initiation of DNA synthesis effectively randomly 
throughout their DNA (Schaarschmidt et al., 2004). They are 
efficiently maintained in proliferating hamster and human 
cells without drug selection and when introduced into swine 
embryos can support expression of GFP in most tissues of 
fetal animals (Manzini et al., 2006). 

C. EBNA1 

Epstein Barr nuclear antigen 1 (EBNA1) is a DNA- 
binding protein that binds to FR and DS of oriP or Rep* to 
facilitate replication and faithful partitioning of the EBV 
plasmid to daughter cells independent of, but in concert 
with, cell chromosomes during each cell division.

The 641 amino acids (AA) of EBNA1 have been categorized
into domains associated with its varied functions by
mutational and deletional analyses (FIG. 2). Two regions, 
between AA40-89 and AA329-378 are capable of linking 
two DNA elements in cis or in trans when bound by EBNA1, 
and have thus been termed Linking Region 1 and 2 (LR1, 
LR2) (Middleton and Sugden, 1992; Frappier and 
O'Donnell, 1991; Su et al., 1991; Mackey et al., 1995). 
Fusing these domains of EBNA1 to GFP homes the GFP to 
mitotic chromosomes (Marechal et al., 1999; Kanda et al., 
2001). LR1 and LR2 are functionally redundant for repli- 
cation; a deletion of either one yields a derivative of EBNA1 
capable of supporting DNA replication (Mackey and Sug- 
den, 1999; Sears et al., 2004). LR1 and LR2 are rich in 
arginine and glycine residues, and resemble the AT-hook
motifs that bind A/T rich DNA (Aravind and Landsman, 1998;
Sears et al., 2004). An in vitro analysis of LR1 and
LR2 of EBNA1 has demonstrated their ability to bind to A/T
rich DNA (Sears et al., 2004). When LR1, containing one
such AT-hook, was fused to the DNA-binding and dimerization
domain of EBNA1, it was found to be sufficient for
DNA replication of oriP plasmids, albeit less efficiently than
the wild-type EBNA1 (ibid).

LR1 and LR2 do differ, though. The C-terminal half of 
LR1 is composed of amino acids other than the repeated 
Arg-Gly of the N-terminal half, and is termed unique region
1 (UR1). UR1 is necessary for EBNA1 to activate transcription
efficiently from transfected and integrated reporter
DNAs containing FR (Wu et al., 2002; Kennedy and Sugden,
2003; Altmann et al., 2006). UR1 is also essential for the
efficient transformation of B-cells infected by EBV. When a
derivative of EBNA1 lacking this domain replaces the
wild-type protein in the context of the whole virus, these
derivative viruses have 0.1% of the transforming ability of
the wild-type virus (Altmann et al., 2006).

LR2 is not required for EBNA1’s support of oriP repli- 
cation (Shire et al., 1999; Mackey and Sugden, 1999; Sears 
et al., 2004). Additionally, the N-terminal half of EBNA1
can be replaced with cellular proteins containing AT-hook
motifs, such as HMGA1a, and still retain replicative function
(Hung et al., 2001; Sears et al., 2003; Altmann et al.,
2006). These findings indicate that it is likely the AT-hook
activities of LR1 and LR2 that are required for the maintenance
of oriP in human cells.

A third of EBNA1’s residues (AA91-328) consist of 
glycine-glycine-alanine (GGA) repeats, implicated in 
EBNA1's ability to evade the host immune response by
inhibiting proteasomal degradation and presentation
(Levitskaya et al., 1995; Levitskaya et al., 1997). These repeats
have also been found to inhibit translation of EBNA1 in
vitro and in vivo (Yin et al., 2003). However, the deletion of
much of this domain has no apparent effect on functions of
EBNA1 in cell culture, making the role that this domain
plays difficult to elucidate.

A nuclear localization signal (NLS) is encoded by 
AA379-386, which also associates with the cellular nuclear 
importation machinery (Kim et al., 1997; Fischer et al., 
1997). Sequences within the Arg-Gly rich regions of LR1
and LR2 may also function as NLSs due to their highly basic 
content. 

Lastly, the C-terminus (AA458-607) encodes the over- 
lapping DNA-binding and dimerization domains of EBNA1. 
The structure of these domains bound to DNA has been 
solved by X-ray crystallography, and was found to be similar 
to the DNA-binding domain of the E2 protein of papillo- 
maviruses (Hegde et al., 1992; Kim et al., 2000; Bochkarev 
et al., 1996).

In specific embodiments of the invention, a reprogramming
vector will contain both oriP and an abbreviated
sequence encoding a version of EBNA1 competent to support
plasmid replication and its proper maintenance during
cell division. The highly repetitive sequence within the
amino-terminal one-third of wild-type EBNA1 and a 25
amino-acid region that has demonstrated toxicity in
various cells are dispensable for EBNA1’s trans-acting
function associated with oriP (Yates et al. 1985; Kennedy et
al. 2003). Therefore, an exemplary derivative, the abbreviated
form of EBNA1, known as deltaUR1 (the derivative
with a protein sequence SEQ ID NO:3, which is encoded by
SEQ ID NO:4), could be used alongside oriP within this
plasmid-based system. More examples of EBNA1 derivatives
that can activate transcription from an extra-chromosomal
template are known (see, for example, Kirchmaier and Sugden,
1997, and Kennedy and Sugden, 2003, both incorporated
herein by reference).

A derivative of EBNA-1 used in the invention is a 
polypeptide which, relative to a corresponding wild-type 
polypeptide, has a modified amino acid sequence. The 
modifications include the deletion, insertion or substitution 
of at least one amino acid residue in a region corresponding 
to the unique region (residues about 65 to about 89) of LR1 
(residues about 40 to about 89) in EBNA-1, and may include 
a deletion, insertion and/or substitution of one or more 
amino acid residues in regions corresponding to other resi- 
dues of EBNA-1, e.g., about residue 1 to about residue 40, 
residues about 90 to about 328 (“Gly-Gly-Ala” repeat 
region), residues about 329 to about 377 (LR2), residues 
about 379 to about 386 (NLS), residues about 451 to about 
608 (DNA binding and dimerization), or residues about 609 
to about 641, so long as the resulting derivative has the 
desired properties, e.g., dimerizes and binds DNA contain- 
ing an ori corresponding to oriP, localizes to the nucleus, is 
not cytotoxic, and activates transcription from an
extrachromosomal template but does not substantially activate
transcription from an integrated template. Substitutions include
substitutions which utilize the D rather than L form, as well as other
well known amino acid analogs, e.g., unnatural amino acids
such as α,α-disubstituted amino acids, N-alkyl amino acids,
lactic acid, and the like. These analogs include phosphoserine,
phosphothreonine, phosphotyrosine, hydroxyproline,
gamma-carboxyglutamate, hippuric acid, octahydroindole-2-carboxylic
acid, statine, 1,2,3,4-tetrahydroisoquinoline-3-carboxylic
acid, penicillamine, ornithine, citrulline,
α-methyl-alanine, para-benzoyl-phenylalanine, phenylglycine,
propargylglycine, sarcosine, ε-N,N,N-trimethyllysine,
ε-N-acetyllysine, N-acetylserine, N-formylmethionine,
3-methylhistidine, 5-hydroxylysine, ω-N-methylarginine,
and other similar amino acids and imino acids and
tert-butylglycine.
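The approximate residue ranges named above for the regions of EBNA-1 can be collected into a small lookup table, which makes it easy to see which region a proposed modification would fall in. This is only an illustrative sketch: the boundaries are stated in the text as "about" values and are treated here as exact inclusive ranges, and the names `EBNA1_REGIONS` and `regions_for_residue` are hypothetical.

```python
# Approximate EBNA-1 regions as (label, first residue, last residue), inclusive.
# Boundaries are taken from the "about" values in the text and treated as exact
# for illustration only.
EBNA1_REGIONS = [
    ("N-terminal", 1, 40),
    ("LR1", 40, 89),
    ("UR1 (unique region of LR1)", 65, 89),
    ("Gly-Gly-Ala repeats", 90, 328),
    ("LR2", 329, 377),
    ("NLS", 379, 386),
    ("DNA binding and dimerization", 451, 608),
    ("C-terminal", 609, 641),
]

EBNA1_LENGTH = 641  # total length of wild-type EBNA1 given in the text


def regions_for_residue(position):
    """Return the labels of every listed region containing this residue."""
    if not 1 <= position <= EBNA1_LENGTH:
        raise ValueError(f"residue {position} is outside 1..{EBNA1_LENGTH}")
    return [name for name, first, last in EBNA1_REGIONS
            if first <= position <= last]
```

For example, `regions_for_residue(70)` reports both LR1 and its unique region, while a residue that falls between the listed ranges returns an empty list.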

Conservative amino acid substitutions are preferred—that 
is, for example, aspartic-glutamic as polar acidic amino 
acids; lysine/arginine/histidine as polar basic amino acids; 
leucine/isoleucine/methionine/valine/alanine/glycine/pro- 
line as non-polar or hydrophobic amino acids; serine/threo- 
nine as polar or uncharged hydrophilic amino acids. Con- 
servative amino acid substitution also includes groupings 
based on side chains. For example, a group of amino acids 
having aliphatic side chains is glycine, alanine, valine, 
leucine, and isoleucine; a group of amino acids having 
aliphatic-hydroxyl side chains is serine and threonine; a 
group of amino acids having amide-containing side chains is 
asparagine and glutamine; a group of amino acids having 
aromatic side chains is phenylalanine, tyrosine, and tryptophan;
a group of amino acids having basic side chains is
lysine, arginine, and histidine; and a group of amino acids
having sulfur-containing side chains is cysteine and methio- 
nine. For example, it is reasonable to expect that replace- 
ment of a leucine with an isoleucine or valine, an aspartate 
with a glutamate, a threonine with a serine, or a similar 
replacement of an amino acid with a structurally related 
amino acid will not have a major effect on the properties of 
the resulting polypeptide. Whether an amino acid change 
results in a functional polypeptide can readily be determined 
by assaying the specific activity of the polypeptide. 

Amino acid substitutions falling within the scope of the
invention are, in general, accomplished by selecting
substitutions that do not differ significantly in their effect on
maintaining (a) the structure of the peptide backbone in the 
area of the substitution, (b) the charge or hydrophobicity of 
the molecule at the target site, or (c) the bulk of the side 
chain. Naturally occurring residues are divided into groups 
based on common side-chain properties: 

(1) hydrophobic: norleucine, met, ala, val, leu, ile; 

(2) neutral hydrophilic: cys, ser, thr; 

(3) acidic: asp, glu; 

(4) basic: asn, gln, his, lys, arg; 

(5) residues that influence chain orientation: gly, pro; and 

(6) aromatic: trp, tyr, phe.

The invention also envisions polypeptides with non- 
conservative substitutions. Non-conservative substitutions 
entail exchanging a member of one of the classes described 
above for another. 
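The six side-chain classes listed above, together with the definition of a non-conservative substitution as an exchange between classes, can be expressed as a simple membership check. The sketch below copies the class membership exactly as listed in the text; the names `SIDE_CHAIN_CLASSES`, `side_chain_class` and `is_non_conservative` are hypothetical and introduced only for illustration.

```python
# Side-chain classes exactly as grouped in the text (three-letter codes,
# with norleucine spelled out as it appears there).
SIDE_CHAIN_CLASSES = {
    "hydrophobic": {"norleucine", "met", "ala", "val", "leu", "ile"},
    "neutral hydrophilic": {"cys", "ser", "thr"},
    "acidic": {"asp", "glu"},
    "basic": {"asn", "gln", "his", "lys", "arg"},
    "chain orientation": {"gly", "pro"},
    "aromatic": {"trp", "tyr", "phe"},
}


def side_chain_class(residue):
    """Return the name of the class that contains the residue."""
    key = residue.lower()
    for class_name, members in SIDE_CHAIN_CLASSES.items():
        if key in members:
            return class_name
    raise ValueError(f"unknown residue: {residue!r}")


def is_non_conservative(original, replacement):
    """True when the substitution exchanges one class for another."""
    return side_chain_class(original) != side_chain_class(replacement)
```

Under this grouping, replacing leu with ile stays within the hydrophobic class, whereas replacing asp with lys crosses from the acidic class to the basic class.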

Acid addition salts of the polypeptide or of amino resi- 
dues of the polypeptide may be prepared by contacting the 
polypeptide or amine with one or more equivalents of the 
desired inorganic or organic acid, such as, for example, 
hydrochloric acid. Esters of carboxyl groups of the poly- 
peptides may also be prepared by any of the usual methods 
known in the art. 

Analogs include structures having one or more peptide
linkages optionally replaced by a linkage selected from the
group consisting of: —CH2NH—, —CH2S—, —CH2CH2—,
—CH=CH— (cis and trans), —CH=CF— (trans),
—COCH2—, —CH(OH)CH2—, and —CH2SO—, by methods
known in the art and further described in the following
references: Spatola, 1983; Spatola, 1983; Morley, 1980;
Hudson et al., 1979 (—CH2NH—, —CH2CH2—); Spatola et
al., 1986 (—CH2—S); Hann, 1982 (—CH=CH—, cis and
trans); Almquist et al., 1980 (—COCH2—); Jennings-White
et al., 1982 (—COCH2—); Szelke et al., European Appln. EP
45665 (—CH(OH)CH2—); Holladay et al., 1983 (—C(OH)
CH2—); and Hruby, 1982 (—CH2S—); each of which is
incorporated herein by reference. A particularly preferred
non-peptide linkage is —CH2NH—. Such analogs may have
greater chemical stability, enhanced pharmacological prop- 
erties (half-life, absorption, potency, efficacy, etc.), altered 
specificity (e.g., a broad-spectrum of biological activities), 
reduced antigenicity, and be economically prepared. 

D. Residue-Free Feature 

Importantly, the replication and maintenance of oriP-based
plasmids is imperfect, and the plasmids are lost precipitously
(25% per cell division) from cells within the first two weeks
after being introduced into cells; however, those cells that
retain the plasmid lose it less frequently (3% per cell division)
(Leight and Sugden, 2001; Nanbo and Sugden, 2007). Once
selection for cells harboring the plasmid is removed, plasmids
will be lost during each cell division until all of them
have been eliminated over time without leaving a footprint
of their former existence within the resulting daughter cells. It
is this footprintless feature that underlies the appeal of the
oriP-based system as an alternative to the current
viral-associated approach to deliver genes to generate iPS cells
and other desired cells with differentiation programming.
Other extra-chromosomal vectors will also be lost during 
replication and propagation of host cells and could also be 
employed in the present invention. 
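The loss rates quoted in this section can be given a rough quantitative feel with a simple geometric model. The sketch below assumes, purely for illustration, that a constant fraction of plasmid-bearing cells loses the plasmid at every division and that no selection is applied; real populations need not behave this way. The function names are hypothetical.

```python
def retained_fraction(loss_per_division, divisions):
    """Fraction of cells still carrying the plasmid after a number of divisions,
    assuming a constant, independent loss rate per division."""
    if not 0.0 <= loss_per_division <= 1.0:
        raise ValueError("loss_per_division must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return (1.0 - loss_per_division) ** divisions


def divisions_until_below(loss_per_division, threshold):
    """Smallest number of divisions after which the retained fraction
    drops below the threshold (both strictly between 0 and 1)."""
    if not 0.0 < loss_per_division < 1.0 or not 0.0 < threshold < 1.0:
        raise ValueError("arguments must lie strictly between 0 and 1")
    divisions = 0
    fraction = 1.0
    while fraction >= threshold:
        fraction *= 1.0 - loss_per_division
        divisions += 1
    return divisions


# Early phase (25% per division) versus established phase (3% per division),
# using the two rates quoted in the text.
early_after_10 = retained_fraction(0.25, 10)
late_after_10 = retained_fraction(0.03, 10)
```

With these assumptions the early-phase rate removes the plasmid from most cells within about ten divisions, while the 3% rate corresponds to a much slower decline, which is consistent with the gradual, footprint-free loss described above.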


V. Vector Construction and Delivery 


In certain embodiments, reprogramming or differentiation
programming vectors could be constructed to comprise
additional elements besides the nucleic acid sequences
encoding reprogramming factors or differentiation programming
factors as described above, to express these reprogramming
factors in cells. A novel feature of these methods
is the use of extra-chromosomally replicating vectors, which
will not be integrated into the host cell genome and may be
lost during generations of replication. Details of components
of these vectors and delivery methods are disclosed below.

A. Vector 

The use of plasmid- or liposome-based extra-chromosomal
vectors, e.g., oriP-based vectors, and/or vectors
encoding a derivative of EBNA-1 permits large fragments of
DNA to be introduced to a cell and maintained
extra-chromosomally, replicated once per cell cycle, and partitioned to
daughter cells efficiently, while eliciting substantially no immune
response. In particular, EBNA-1, the only viral protein
required for the replication of the oriP-based expression 
vector, does not elicit a cellular immune response because it 
has developed an efficient mechanism to bypass the pro- 
cessing required for presentation of its antigens on MHC 
class I molecules (Levitskaya et al., 1997). Further, EBNA-1 
can act in trans to enhance expression of the cloned gene, 
inducing expression of a cloned gene up to 100-fold in some 
cell lines (Langle-Rouault et al., 1998; Evans et al., 1997). 
Finally, the manufacture of such oriP-based expression 
vectors is inexpensive. 

Other extra-chromosomal vectors include other lymphotrophic
herpes virus-based vectors. A lymphotrophic herpes
virus is a herpes virus that replicates in a lymphoblast (e.g.,
a human B lymphoblast) and becomes a plasmid for a part
of its natural life-cycle. Herpes simplex virus (HSV) is not
a “lymphotrophic” herpes virus. Exemplary lymphotrophic
herpes viruses include, but are not limited to, EBV, Kaposi’s
sarcoma herpes virus (KSHV), Herpes virus saimiri (HS),
and Marek’s disease virus (MDV). Other sources of
episome-based vectors are also contemplated, such as yeast ARS,
adenovirus, SV40, or BPV.

One of skill in the art would be well equipped to construct 
a vector through standard recombinant techniques (see, for 
example, Maniatis et al., 1988 and Ausubel et al., 1994, both 
incorporated herein by reference). 

Vectors can also comprise other components or function- 
alities that further modulate gene delivery and/or gene 
expression, or that otherwise provide beneficial properties to 
the targeted cells. Such other components include, for 
example, components that influence binding or targeting to 
cells (including components that mediate cell-type or tissue- 
specific binding); components that influence uptake of the 
vector nucleic acid by the cell; components that influence 
localization of the polynucleotide within the cell after uptake 
(such as agents mediating nuclear localization); and com- 
ponents that influence expression of the polynucleotide. 

Such components also might include markers, such as 
detectable and/or selection markers that can be used to 
detect or select for cells that have taken up and are expressing
the nucleic acid delivered by the vector. Such components
can be provided as a natural feature of the vector (such
as the use of certain viral vectors which have components or
functionalities mediating binding and uptake), or vectors can 
be modified to provide such functionalities. A large variety 
of such vectors are known in the art and are generally 
available. When a vector is maintained in a host cell, the 
vector can either be stably replicated by the cells during 
mitosis as an autonomous structure, incorporated within the 
genome of the host cell, or maintained in the host cell’s 
nucleus or cytoplasm. 

B. Regulatory Elements

Eukaryotic expression cassettes included in the vectors 
preferably contain (in a 5'-to-3' direction) a eukaryotic 
transcriptional promoter operably linked to a protein-coding 
sequence, splice signals including intervening sequences, 
and a transcriptional termination/polyadenylation sequence. 

i. Promoter/Enhancers 

A “promoter” is a control sequence that is a region of a 
nucleic acid sequence at which initiation and rate of tran- 
scription are controlled. It may contain genetic elements at 
which regulatory proteins and molecules may bind, such as 
RNA polymerase and other transcription factors, to initiate 
the specific transcription of a nucleic acid sequence. The
phrases “operatively positioned,” “operatively linked,” 
“under control,” and “under transcriptional control” mean 
that a promoter is in a correct functional location and/or 
orientation in relation to a nucleic acid sequence to control 
transcriptional initiation and/or expression of that sequence. 

Promoters suitable for use in the EBNA-1-encoding vector of
the invention are those that direct the expression of the 
expression cassettes encoding the EBNA-1 protein to result 
in sufficient steady-state levels of EBNA-1 protein to stably 
maintain EBV oriP-containing vectors. Promoters are also 
used for efficient expression of expression cassettes encod- 
ing reprogramming factors. 

A promoter generally comprises a sequence that functions 
to position the start site for RNA synthesis. The best known 
example of this is the TATA box, but in some promoters 
lacking a TATA box, such as, for example, the promoter for 
the mammalian terminal deoxynucleotidyl transferase gene 
and the promoter for the SV40 late genes, a discrete element 
overlying the start site itself helps to fix the place of 
initiation. Additional promoter elements regulate the fre- 
quency of transcriptional initiation. Typically, these are 
located in the region 30-110 bp upstream of the start site, 
although a number of promoters have been shown to contain 
functional elements downstream of the start site as well. To 
bring a coding sequence “under the control of” a promoter, 
one positions the 5' end of the transcription initiation site of 
the transcriptional reading frame “downstream” of (i.e., 3' 
of) the chosen promoter. The “upstream” promoter stimu- 
lates transcription of the DNA and promotes expression of 
the encoded RNA. 

The spacing between promoter elements frequently is 
flexible, so that promoter function is preserved when ele- 
ments are inverted or moved relative to one another. In the 
tk promoter, the spacing between promoter elements can be 
increased to 50 bp apart before activity begins to decline. 
Depending on the promoter, it appears that individual ele- 
ments can function either cooperatively or independently to 
activate transcription. A promoter may or may not be used 
in conjunction with an “enhancer,” which refers to a cis- 
acting regulatory sequence involved in the transcriptional 
activation of a nucleic acid sequence. 

A promoter may be one naturally associated with a 
nucleic acid sequence, as may be obtained by isolating the 
5' non-coding sequences located upstream of the coding 
segment and/or exon. Such a promoter can be referred to as
“endogenous.” Similarly, an enhancer may be one naturally
associated with a nucleic acid sequence, located either 
downstream or upstream of that sequence. Alternatively, 
certain advantages will be gained by positioning the coding 
nucleic acid segment under the control of a recombinant or 
heterologous promoter, which refers to a promoter that is not 
normally associated with a nucleic acid sequence in its 
natural environment. A recombinant or heterologous
enhancer refers also to an enhancer not normally associated 
with a nucleic acid sequence in its natural environment. 
Such promoters or enhancers may include promoters or 
enhancers of other genes, and promoters or enhancers iso- 
lated from any other virus, or prokaryotic or eukaryotic cell, 
and promoters or enhancers not “naturally occurring,” i.e., 
containing different elements of different transcriptional 
regulatory regions, and/or mutations that alter expression. 
For example, promoters that are most commonly used in 
recombinant DNA construction include the β-lactamase
(penicillinase), lactose and tryptophan (trp) promoter sys- 
tems. In addition to producing nucleic acid sequences of 
promoters and enhancers synthetically, sequences may be 
produced using recombinant cloning and/or nucleic acid 
amplification technology, including PCR™, in connection 
with the compositions disclosed herein (see U.S. Pat. Nos. 
4,683,202 and 5,928,906, each incorporated herein by
reference). Furthermore, it is contemplated that the control
sequences that direct transcription and/or expression of 
sequences within non-nuclear organelles such as mitochon- 
dria, chloroplasts, and the like, can be employed as well. 

Naturally, it will be important to employ a promoter 
and/or enhancer that effectively directs the expression of the 
DNA segment in the organelle, cell type, tissue, organ, or 
organism chosen for expression. Those of skill in the art of 
molecular biology generally know the use of promoters, 
enhancers, and cell type combinations for protein expression
(see, for example, Sambrook et al. 1989, incorporated
herein by reference). The promoters employed may be 
constitutive, tissue-specific, inducible, and/or useful under 
the appropriate conditions to direct high level expression of 
the introduced DNA segment, such as is advantageous in the 
large-scale production of recombinant proteins and/or pep- 
tides. The promoter may be heterologous or endogenous. 

Additionally any promoter/enhancer combination (as per, 
for example, the Eukaryotic Promoter Data Base EPDB, 
through world wide web at epd.isb-sib.ch/) could also be 
used to drive expression. Use of a T3, T7 or SP6 cytoplasmic 
expression system is another possible embodiment. Eukary- 
otic cells can support cytoplasmic transcription from certain 
bacterial promoters if the appropriate bacterial polymerase is 
provided, either as part of the delivery complex or as an 
additional genetic expression construct. 

Non-limiting examples of promoters include early or late 
viral promoters, such as, SV40 early or late promoters, 
cytomegalovirus (CMV) immediate early promoters, Rous 
Sarcoma Virus (RSV) early promoters; eukaryotic cell pro- 
moters, such as, e.g., beta actin promoter (Ng, S. Y., Nuc. 
Acid Res. 17: 601-615, 1989, Quitsche et al., J. Biol. Chem. 
264: 9539-9545, 1989), GAPDH promoter (Alexander et al.,
Proc. Nat. Acad. Sci. USA 85: 5092-5096, 1988, Ercolani et 
al., J. Biol. Chem. 263: 15335-15341, 1988), metallothion- 
ein promoter (Karin et al. Cell 36: 371-379, 1989; Richards 
et al., Cell 37: 263-272, 1984); and concatenated response 
element promoters, such as cyclic AMP response element 
promoters (cre), serum response element promoter (sre), 
phorbol ester promoter (TPA) and response element pro- 
moters (tre) near a minimal TATA box. It is also possible to 
use human growth hormone promoter sequences (e.g., the
human growth hormone minimal promoter described at
Genbank, accession no. X05244, nucleotide 283-341) or a
mouse mammary tumor promoter (available from the ATCC,
Cat. No. ATCC 45007). A specific example could be a
phosphoglycerate kinase (PGK) promoter.

ii. Initiation Signals and Internal Ribosome Binding Sites 

A specific initiation signal also may be required for 
efficient translation of coding sequences. These signals 
include the ATG initiation codon or adjacent sequences. 
Exogenous translational control signals, including the ATG 
initiation codon, may need to be provided. One of ordinary 
skill in the art would readily be capable of determining this 
and providing the necessary signals. It is well known that the 
initiation codon must be “in-frame” with the reading frame
of the desired coding sequence to ensure translation of the 
entire insert. The exogenous translational control signals and 
initiation codons can be either natural or synthetic. The 
efficiency of expression may be enhanced by the inclusion of 
appropriate transcription enhancer elements. 
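
The in-frame requirement described above can be illustrated with a minimal Python sketch. The function name `is_in_frame_orf` and the test sequences are hypothetical and are not part of the disclosure; the check assumes the three stop codons of the standard genetic code (TAA, TAG, TGA) and a coding sequence supplied as a DNA string.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def is_in_frame_orf(seq: str) -> bool:
    """Return True if seq starts with ATG, has a length divisible by 3,
    ends with a stop codon, and has no earlier in-frame stop codon."""
    seq = seq.upper()
    if not seq.startswith("ATG") or len(seq) % 3 != 0:
        return False
    # Split the sequence into codons, reading from the initiation codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Only the final codon may be a stop codon.
    return codons[-1] in STOP_CODONS and not any(
        codon in STOP_CODONS for codon in codons[:-1]
    )
```

A sequence whose length is not a multiple of three, or which carries a premature in-frame stop codon, is rejected, which mirrors the concern that the entire insert be translated.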

In certain embodiments of the invention, internal ribosome 
entry site (IRES) elements are used to create multigene, or 
polycistronic, messages. IRES elements 
are able to bypass the ribosome scanning model of 5' 
methylated Cap dependent translation and begin translation 
at internal sites (Pelletier and Sonenberg, 1988). IRES 
elements from two members of the picornavirus family 
(polio and encephalomyocarditis) have been described (Pelletier 
and Sonenberg, 1988), as well as an IRES from a 
mammalian message (Macejak and Sarnow, 1991). IRES 
elements can be linked to heterologous open reading frames. 
Multiple open reading frames can be transcribed together, 
each separated by an IRES, creating polycistronic messages. 
By virtue of the IRES element, each open reading frame is 
accessible to ribosomes for efficient translation. Multiple 
genes can be efficiently expressed using a single promoter/ 
enhancer to transcribe a single message (see U.S. Pat. Nos. 
5,925,565 and 5,935,819, each herein incorporated by ref- 
erence). 

iii. Multiple Cloning Sites 

Vectors can include a multiple cloning site (MCS), which 
is a nucleic acid region that contains multiple restriction 
enzyme sites, any of which can be used in conjunction with 
standard recombinant technology to digest the vector (see, 
for example, Carbonelli et al., 1999, Levenson et al., 1998, 
and Cocea, 1997, incorporated herein by reference.) 
"Restriction enzyme digestion" refers to catalytic cleavage 
of a nucleic acid molecule with an enzyme that functions 
only at specific locations in a nucleic acid molecule. Many 
of these restriction enzymes are commercially available. Use 
of such enzymes is widely understood by those of skill in the 
art. Frequently, a vector is linearized or fragmented using a 
restriction enzyme that cuts within the MCS to enable 
exogenous sequences to be ligated to the vector. “Ligation” 
refers to the process of forming phosphodiester bonds 
between two nucleic acid fragments, which may or may not 
be contiguous with each other. Techniques involving restric- 
tion enzymes and ligation reactions are well known to those 
of skill in the art of recombinant technology. 
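
As a hedged illustration of restriction enzyme digestion, the following Python sketch locates recognition sites in a linear sequence and returns the resulting top-strand fragments. The function names and the example sequence are hypothetical. The default site GAATTC with a cut one base into the site corresponds to EcoRI (G^AATTC); only the top strand is modeled, so the complementary-strand overhang is not represented.

```python
def find_sites(seq: str, site: str = "GAATTC") -> list[int]:
    """Return 0-based start positions of every occurrence of site in seq."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def linear_fragments(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear sequence cut_offset bases into each site (top strand only)."""
    seq = seq.upper()
    cuts = [pos + cut_offset for pos in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    # Consecutive boundaries delimit the fragments, in 5'-to-3' order.
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

A vector with a single site inside its MCS would yield two fragments when supplied as a linear string; a circular plasmid would instead be linearized, which this sketch does not model.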

iv. Splicing Sites 

Most transcribed eukaryotic RNA molecules will undergo 
RNA splicing to remove introns from the primary tran- 
scripts. Vectors containing genomic eukaryotic sequences 
may require donor and/or acceptor splicing sites to ensure 
proper processing of the transcript for protein expression 
(see, for example, Chandler et al., 1997, herein incorporated 
by reference.) 

v. Termination Signals 

The vectors or constructs of the present invention will 
generally comprise at least one termination signal. A “ter- 
mination signal” or “terminator” is comprised of the DNA 
sequences involved in specific termination of an RNA 
transcript by an RNA polymerase. Thus, in certain embodi- 
ments a termination signal that ends the production of an 
RNA transcript is contemplated. A terminator may be nec- 
essary in vivo to achieve desirable message levels. 

In eukaryotic systems, the terminator region may also 
comprise specific DNA sequences that permit site-specific 
cleavage of the new transcript so as to expose a polyade- 
nylation site. This signals a specialized endogenous poly- 
merase to add a stretch of about 200 A residues (polyA) to 
the 3' end of the transcript. RNA molecules modified with 
this polyA tail appear to be more stable and are translated more 
efficiently. Thus, in other embodiments involving eukaryotes, 
it is preferred that the terminator comprises a signal for 
the cleavage of the RNA, and it is more preferred that the 
terminator signal promotes polyadenylation of the message. 
The terminator and/or polyadenylation site elements can 
serve to enhance message levels and to minimize read 
through from the cassette into other sequences. 

Terminators contemplated for use in the invention include 
any known terminator of transcription described herein or 
known to one of ordinary skill in the art, including but not 
limited to, for example, the termination sequences of genes, 
such as for example the bovine growth hormone terminator 
or viral termination sequences, such as for example the 
SV40 terminator. In certain embodiments, the termination 
signal may be a lack of transcribable or translatable 
sequence, such as due to a sequence truncation. 

vi. Polyadenylation Signals 

In expression, particularly eukaryotic expression, one will 
typically include a polyadenylation signal to effect proper 
polyadenylation of the transcript. The nature of the poly- 
adenylation signal is not believed to be crucial to the 
successful practice of the invention, and any such sequence 
may be employed. Preferred embodiments include the SV40 
polyadenylation signal or the bovine growth hormone poly- 
adenylation signal, convenient and known to function well 
in various target cells. Polyadenylation may increase the 
stability of the transcript or may facilitate cytoplasmic 
transport. 

vii. Origins of Replication 

To propagate a vector in a host cell, the vector may contain 
one or more origin of replication sites (often termed “ori”), 
each of which is a specific nucleic acid sequence at which 
replication is initiated, for example, a nucleic acid sequence 
corresponding to oriP of EBV as described above or a 
genetically engineered oriP with a similar or elevated 
function in differentiation programming. Alternatively, a 
replication origin of another extra-chromosomally replicating 
virus as described above or an autonomously replicating 
sequence (ARS) can be employed. 

viii. Selection and Screenable Markers 

In certain embodiments of the invention, cells containing 
a nucleic acid construct of the present invention may be 
identified in vitro or in vivo by including a marker in the 
expression vector. Such markers would confer an identifi- 
able change to the cell permitting easy identification of cells 
containing the expression vector. Generally, a selection 
marker is one that confers a property that allows for selec- 
tion. A positive selection marker is one in which the pres- 
ence of the marker allows for its selection, while a negative 
selection marker is one in which its presence prevents its 
selection. An example of a positive selection marker is a 
drug resistance marker. 
Usually the inclusion of a drug selection marker aids in 
the cloning and identification of transformants, for example, 
genes that confer resistance to neomycin, puromycin, hygro- 
mycin, DHFR, GPT, zeocin and histidinol are useful selec- 
tion markers. In addition to markers conferring a phenotype 
that allows for the discrimination of transformants based on 
the implementation of conditions, other types of markers 
including screenable markers such as GFP, whose basis is 
colorimetric analysis, are also contemplated. Alternatively, 
screenable enzymes as negative selection markers such as 
herpes simplex virus thymidine kinase (tk) or chlorampheni- 
col acetyltransferase (CAT) may be utilized. One of skill in 
the art would also know how to employ immunologic 
markers, possibly in conjunction with FACS analysis. The 
marker used is not believed to be important, so long as it is 
capable of being expressed simultaneously with the nucleic 
acid encoding a gene product. Further examples of selection 
and screenable markers are well known to one of skill in the 
art. One feature of the present invention includes using 
selection and screenable markers to select vector-free cells 
after the differentiation programming factors have effected a 
desired altered differentiation status in those cells. 

C. Vector Delivery 

Introduction of a reprogramming or differentiation pro- 
gramming vector into somatic cells with the current inven- 
tion may use any suitable methods for nucleic acid delivery 
for transformation of a cell, as described herein or as would 
be known to one of ordinary skill in the art. Such methods 
include, but are not limited to, direct delivery of DNA such 
as by ex vivo transfection (Wilson et al., 1989, Nabel et al., 
1989), by injection (U.S. Pat. Nos. 5,994,624, 5,981,274, 
5,945,100, 5,780,448, 5,736,524, 5,702,932, 5,656,610, 
5,589,466 and 5,580,859, each incorporated herein by ref- 
erence), including microinjection (Harlan and Weintraub, 
1985; U.S. Pat. No. 5,789,215, incorporated herein by 
reference); by electroporation (U.S. Pat. No. 5,384,253, 
incorporated herein by reference; Tur-Kaspa et al., 1986; 
Potter et al., 1984); by calcium phosphate precipitation 
(Graham and Van Der Eb, 1973; Chen and Okayama, 1987; 
Rippe et al., 1990); by using DEAE-dextran followed by 
polyethylene glycol (Gopal, 1985); by direct sonic loading 
(Fechheimer et al., 1987); by liposome mediated transfec- 
tion (Nicolau and Sene, 1982; Fraley et al., 1979; Nicolau et 
al., 1987; Wong et al., 1980; Kaneda et al., 1989; Kato et al., 
1991) and receptor-mediated transfection (Wu and Wu, 
1987; Wu and Wu, 1988); by microprojectile bombardment 
(PCT Application Nos. WO 94/09699 and 95/06128; U.S. 
Pat. Nos. 5,610,042; 5,322,783; 5,563,055; 5,550,318; 
5,538,877 and 5,538,880, each incorporated herein by 
reference); by agitation with silicon carbide fibers (Kaeppler et 
al., 1990; U.S. Pat. Nos. 5,302,523 and 5,464,765, each 
incorporated herein by reference); by Agrobacterium-medi- 
ated transformation (U.S. Pat. Nos. 5,591,616 and 5,563, 
055, each incorporated herein by reference); by PEG-medi- 
ated transformation of protoplasts (Omirulleh et al., 1993; 
U.S. Pat. Nos. 4,684,611 and 4,952,500, each incorporated 
herein by reference); by desiccation/inhibition-mediated 
DNA uptake (Potrykus et al., 1985), and any combination of 
such methods. Through the application of techniques such as 
these, organelle(s), cell(s), tissue(s) or organism(s) may be 
stably or transiently transformed. 

i. Liposome-Mediated Transfection 

In a certain embodiment of the invention, a nucleic acid 
may be entrapped in a lipid complex such as, for example, 
a liposome. Liposomes are vesicular structures characterized 
by a phospholipid bilayer membrane and an inner aqueous 
medium. Multilamellar liposomes have multiple lipid layers 
separated by aqueous medium. They form spontaneously 
when phospholipids are suspended in an excess of aqueous 
solution. The lipid components undergo self-rearrangement 
before the formation of closed structures and entrap water 
and dissolved solutes between the lipid bilayers (Ghosh and 
Bachhawat, 1991). Also contemplated is a nucleic acid 
complexed with Lipofectamine (Gibco BRL) or Superfect 
(Qiagen). The amount of liposomes used may vary with the 
nature of the liposome as well as the cell used; for example, 
about 5 to about 20 μg vector DNA per 1 to 10 million 
cells may be contemplated. 

Liposome-mediated nucleic acid delivery and expression 
of foreign DNA in vitro has been very successful (Nicolau 
and Sene, 1982; Fraley et al., 1979; Nicolau et al., 1987). 
The feasibility of liposome-mediated delivery and expres- 
sion of foreign DNA in cultured chick embryo, HeLa and 
hepatoma cells has also been demonstrated (Wong et al., 
1980). 

In certain embodiments of the invention, a liposome may 
be complexed with a hemagglutinating virus (HVJ). This has 
been shown to facilitate fusion with the cell membrane and 
promote cell entry of liposome-encapsulated DNA (Kaneda 
et al., 1989). In other embodiments, a liposome may be 
complexed or employed in conjunction with nuclear non- 
histone chromosomal proteins (HMG-1) (Kato et al., 1991). 
In yet further embodiments, a liposome may be complexed 
or employed in conjunction with both HVJ and HMG-1. In 
other embodiments, a delivery vehicle may comprise a 
ligand and a liposome. 

ii. Electroporation 

In certain embodiments of the present invention, a nucleic 
acid is introduced into an organelle, a cell, a tissue or an 
organism via electroporation. Electroporation involves the 
exposure of a suspension of cells and DNA to a high-voltage 
electric discharge. Recipient cells can be made more susceptible 
to transformation by mechanical wounding. Also, 
the amount of vector used may vary with the nature of the 
cells used; for example, about 5 to about 20 μg vector DNA 
per 1 to 10 million cells may be contemplated. 

Transfection of eukaryotic cells using electroporation has 
been quite successful. Mouse pre-B lymphocytes have been 
transfected with human kappa-immunoglobulin genes (Pot- 
ter et al., 1984), and rat hepatocytes have been transfected 
with the chloramphenicol acetyltransferase gene (Tur-Kaspa 
et al., 1986) in this manner. 

iii. Calcium Phosphate 

In other embodiments of the present invention, a nucleic 
acid is introduced to the cells using calcium phosphate 
precipitation. Human KB cells have been transfected with 
adenovirus 5 DNA (Graham and Van Der Eb, 1973) using 
this technique. Also in this manner, mouse L(A9), mouse 
C127, CHO, CV-1, BHK, NIH3T3 and HeLa cells were 
transfected with a neomycin marker gene (Chen and 
Okayama, 1987), and rat hepatocytes were transfected with 
a variety of marker genes (Rippe et al., 1990). 

iv. DEAE-Dextran 

In another embodiment, a nucleic acid is delivered into a 
cell using DEAE-dextran followed by polyethylene glycol. 
In this manner, reporter plasmids were introduced into 
mouse myeloma and erythroleukemia cells (Gopal, 1985). 

v. Sonication Loading 

Additional embodiments of the present invention include 
the introduction of a nucleic acid by direct sonic loading. 
LTK- fibroblasts have been transfected with the thymidine 
kinase gene by sonication loading (Fechheimer et al., 1987). 

vi. Receptor Mediated Transfection 

Still further, a nucleic acid may be delivered to a target 
cell via receptor-mediated delivery vehicles. These take 
advantage of the selective uptake of macromolecules by 
receptor-mediated endocytosis that will be occurring in a 
target cell. In view of the cell type-specific distribution of 
various receptors, this delivery method adds another degree 
of specificity to the present invention. 

Certain receptor-mediated gene targeting vehicles com- 
prise a cell receptor-specific ligand and a nucleic acid- 
binding agent. Others comprise a cell receptor-specific 
ligand to which the nucleic acid to be delivered has been 
operatively attached. Several ligands have been used for 
receptor-mediated gene transfer (Wu and Wu, 1987; Wagner 
et al., 1990; Perales et al., 1994; Myers, EPO 0273085), 
which establishes the operability of the technique. Specific 
delivery in the context of another mammalian cell type has 
been described (Wu and Wu, 1993; incorporated herein by 
reference). In certain aspects of the present invention, a 
ligand will be chosen to correspond to a receptor specifically 
expressed on the target cell population. 

In other embodiments, a nucleic acid delivery vehicle 
component of a cell-specific nucleic acid targeting vehicle 
may comprise a specific binding ligand in combination with 
a liposome. The nucleic acid(s) to be delivered are housed 
within the liposome and the specific binding ligand is 
functionally incorporated into the liposome membrane. The 
liposome will thus specifically bind to the receptor(s) of a 
target cell and deliver the contents to a cell. Such systems 
have been shown to be functional using systems in which, 
for example, epidermal growth factor (EGF) is used in the 
receptor-mediated delivery of a nucleic acid to cells that 
exhibit upregulation of the EGF receptor. 

In still further embodiments, the nucleic acid delivery 
vehicle component of a targeted delivery vehicle may be a 
liposome itself, which will preferably comprise one or more 
lipids or glycoproteins that direct cell-specific binding. For 
example, lactosyl-ceramide, a galactose-terminal asialoganglioside, 
has been incorporated into liposomes, and 
an increase in the uptake of the insulin gene by 
hepatocytes was observed (Nicolau et al., 1987). It is contemplated that the 
tissue-specific transforming constructs of the present inven- 
tion can be specifically delivered into a target cell in a 
similar manner. 

vii. Microprojectile Bombardment 

Microprojectile bombardment techniques can be used to 
introduce a nucleic acid into at least one organelle, cell, 
tissue or organism (U.S. Pat. No. 5,550,318; U.S. Pat. No. 
5,538,880; U.S. Pat. No. 5,610,042; and PCT Application 
WO 94/09699; each of which is incorporated herein by 
reference). This method depends on the ability to accelerate 
DNA-coated microprojectiles to a high velocity allowing 
them to pierce cell membranes and enter cells without 
killing them (Klein et al., 1987). There are a wide variety of 
microprojectile bombardment techniques known in the art, 
many of which are applicable to the invention. 

In this microprojectile bombardment, one or more par- 
ticles may be coated with at least one nucleic acid and 
delivered into cells by a propelling force. Several devices for 
accelerating small particles have been developed. One such 
device relies on a high voltage discharge to generate an 
electrical current, which in turn provides the motive force 
(Yang et al., 1990). The microprojectiles used have consisted 
of biologically inert substances such as tungsten or gold 
particles or beads. Exemplary particles include those com- 
prised of tungsten, platinum, and preferably, gold. It is 
contemplated that in some instances DNA precipitation onto 
metal particles would not be necessary for DNA delivery to 
a recipient cell using microprojectile bombardment. How- 
ever, it is contemplated that particles may contain DNA 
rather than be coated with DNA. DNA-coated particles may 
increase the level of DNA delivery via particle bombard- 
ment but are not, in and of themselves, necessary. 

For the bombardment, cells in suspension are concen- 
trated on filters or solid culture medium. Alternatively, 
immature embryos or other target cells may be arranged on 
solid culture medium. The cells to be bombarded are posi- 
tioned at an appropriate distance below the macroprojectile 
stopping plate. 


VI. Selection of iPS Cells 


In certain aspects of the invention, after a reprogramming 
vector is introduced into somatic cells, the cells will be cultured 
for expansion (and optionally selected for the presence of vector 
elements, such as a positive selection or screenable marker, to 
concentrate transfected cells). The reprogramming vectors 
will express reprogramming factors in these cells and will 
replicate and partition along with cell division. The expressed 
reprogramming factors will reprogram the somatic cell genome 
to establish a self-sustaining pluripotent state, and in the 
meantime, or after removal of positive selection for the 
presence of vectors, the exogenous genetic elements will be lost 
gradually. These induced pluripotent stem cells could be 
selected from progeny derived from these somatic cells 
based on embryonic stem cell characteristics because they 
are expected to be substantially identical to pluripotent 
embryonic stem cells. An additional negative selection step 
could also be employed to accelerate or help selection of iPS 
cells essentially free of exogenous genetic elements by 
testing for the absence of reprogramming vector DNA or by using 
selection markers. 

A. Selection for Embryonic Stem Cell Characteristics 

The successfully generated iPSCs from previous studies 
were remarkably similar to naturally-isolated pluripotent 
stem cells (such as mouse and human embryonic stem cells, 
mESCs and hESCs, respectively) in the following respects, 
thus confirming the identity, authenticity, and pluripotency 
of iPSCs relative to naturally-isolated pluripotent stem cells. Thus, 
induced pluripotent stem cells generated by the methods 
disclosed in this invention could be selected based on one or 
more of the following embryonic stem cell characteristics. 

i. Cellular Biological Properties 

Morphology: 

iPSCs are morphologically similar to ESCs. Each cell 
may have a round shape, a large nucleolus and scant cytoplasm. 
Colonies of iPSCs could also be similar to those of ESCs. 
Human iPSCs form sharp-edged, flat, tightly-packed colonies 
similar to hESCs, and mouse iPSCs form colonies 
similar to mESCs, which are less flat and more aggregated 
than those of hESCs. 

Growth Properties: 

Doubling time and mitotic activity are cornerstones of 
ESCs, as stem cells must self-renew as part of their defini- 
tion. iPSCs could be mitotically active, actively self-renew- 
ing, proliferating, and dividing at a rate equal to ESCs. 

Stem Cell Markers: 

iPSCs may express cell surface antigenic markers 
expressed on ESCs. Human iPSCs expressed the markers 
specific to hESC, including, but not limited to, SSEA-3, 
SSEA-4, TRA-1-60, TRA-1-81, TRA-2-49/6E, and Nanog. 
Mouse iPSCs expressed SSEA-1 but neither SSEA-3 nor SSEA-4, 
similarly to mESCs. 

Stem Cell Genes: 

iPSCs may express genes expressed in undifferentiated 
ESCs, including Oct-3/4, Sox-2, Nanog, GDF3, REX1, 
FGF4, ESG1, DPPA2, DPPA4, and hTERT. 

Telomerase Activity: 

Telomerases are necessary to sustain cell division unrestricted 
by the Hayflick limit of ~50 cell divisions. hESCs 
express high telomerase activity to sustain self-renewal and 
proliferation, and iPSCs also demonstrate high telomerase 
activity and express hTERT (human telomerase reverse 
transcriptase), a necessary component in the telomerase 
protein complex. 

Pluripotency: 

iPSCs will be capable of differentiation in a fashion 
similar to ESCs into fully differentiated tissues. 

Neural Differentiation: 

iPSCs could be differentiated into neurons, expressing 
βIII-tubulin, tyrosine hydroxylase, AADC, DAT, ChAT, 
LMX1B, and MAP2. The presence of catecholamine-asso- 
ciated enzymes may indicate that iPSCs, like hESCs, may be 
differentiable into dopaminergic neurons. Stem cell-associ- 
ated genes will be downregulated after differentiation. 

Cardiac Differentiation: 

iPSCs could be differentiated into cardiomyocytes that 
spontaneously began beating. Cardiomyocytes expressed 
TnTc, MEF2C, MYL2A, MYHCB, and NKX2.5. Stem 
cell-associated genes will be downregulated after differen- 
tiation. 

Teratoma Formation: 

iPSCs injected into immunodeficient mice may spontaneously 
form teratomas after a certain time, such as nine 
weeks. Teratomas are tumors of multiple lineages containing 
tissue derived from the three germ layers endoderm, meso- 
derm and ectoderm; this is unlike other tumors, which 
typically are of only one cell type. Teratoma formation is a 
landmark test for pluripotency. 

Embryoid Body: 

hESCs in culture spontaneously form ball-like, embryo-like 
structures termed “embryoid bodies,” which consist of 
a core of mitotically active and differentiating hESCs and a 
periphery of fully differentiated cells from all three germ 
layers. iPSCs may also form embryoid bodies and have 
peripheral differentiated cells. 

Blastocyst Injection: 

hESCs naturally reside within the inner cell mass (em- 
bryoblast) of blastocysts, and in the embryoblast, differen- 
tiate into the embryo while the blastocyst's shell (tropho- 
blast) differentiates into extraembryonic tissues. The hollow 
trophoblast is unable to form a living embryo, and thus it is 
necessary for the embryonic stem cells within the embryoblast 
to differentiate and form the embryo. iPSCs injected by 
micropipette into a trophoblast to generate a blastocyst, 
which is then transferred to recipient females, may result in 
chimeric living mouse pups: mice with iPSC derivatives 
incorporated all across their bodies with 10%-90% chimerism. 

ii. Epigenetic Reprogramming 

Promoter Demethylation: 

Methylation is the transfer of a methyl group to a DNA 
base, typically the transfer of a methyl group to a cytosine 
molecule in a CpG site (adjacent cytosine/guanine 
sequence). Widespread methylation of a gene interferes with 
expression by preventing the activity of expression proteins 
or recruiting enzymes that interfere with expression. Thus, 
methylation of a gene effectively silences it by preventing 
transcription. Promoters of pluripotency-associated genes, 
including Oct-3/4, Rex1, and Nanog, may be demethylated 
in iPSCs, showing their promoter activity and the active 
promotion and expression of pluripotency-associated genes 
in iPSCs. 
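
To make the notion of CpG sites concrete, the following Python sketch locates CpG dinucleotides in a promoter sequence and computes the fraction that are methylated. The function names, the example sequence, and the representation of methylation calls as a set of positions are all hypothetical; in practice such calls would come from an assay such as bisulfite sequencing, which is not described in this disclosure.

```python
def cpg_sites(seq: str) -> list[int]:
    """Return 0-based positions of the cytosine in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def methylation_fraction(methylated: set[int], sites: list[int]) -> float:
    """Fraction of CpG sites whose cytosine position is marked as methylated."""
    if not sites:
        return 0.0
    return len(methylated & set(sites)) / len(sites)
```

Under this sketch, a lower fraction for a pluripotency-associated promoter in a candidate clone than in the parental somatic cells would be consistent with promoter demethylation.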

Histone Demethylation: 

Histones are compacting proteins that are structurally 
localized to DNA sequences that can affect their activity 
through various chromatin-related modifications. H3 his- 
tones associated with Oct-3/4, Sox-2, and Nanog may be 
demethylated to activate the expression of Oct-3/4, Sox-2, 
and Nanog. 

B. Selection for Residue Free Feature 

A reprogramming vector such as an oriP-based plasmid in 
this invention will replicate extra-chromosomally and lose its 
presence in host cells after generations. However, an addi- 
tional selection step for progeny cells essentially free of 
exogenous vector elements may facilitate this process. For 
example, a sample of progeny cells may be extracted to test 
for the presence or loss of exogenous vector elements as known 
in the art (Leight and Sugden, Molecular and Cellular 
Biology, 2001). 
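
The gradual loss of an extra-chromosomal vector over generations can be sketched with a simple geometric model. This is an illustrative assumption rather than a result from the disclosure: it supposes that, without selection, each cell division loses the plasmid with a constant probability, and the loss rate passed in is a hypothetical parameter that would have to be measured for a given vector and cell type.

```python
def fraction_retaining(loss_per_generation: float, generations: int) -> float:
    """Fraction of cells still carrying the plasmid after n generations,
    assuming a constant, independent loss probability per generation."""
    if not 0.0 <= loss_per_generation <= 1.0:
        raise ValueError("loss_per_generation must be between 0 and 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - loss_per_generation) ** generations


def generations_until(target_fraction: float, loss_per_generation: float) -> int:
    """Smallest number of generations after which the retaining fraction
    is at or below target_fraction."""
    if not 0.0 < loss_per_generation <= 1.0:
        raise ValueError("loss_per_generation must be in (0, 1]")
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target_fraction must be in (0, 1]")
    generations, fraction = 0, 1.0
    while fraction > target_fraction:
        fraction *= 1.0 - loss_per_generation
        generations += 1
    return generations
```

With an assumed loss of 5% per generation, for example, the model gives about 90 generations before no more than 1% of cells retain the vector, which suggests why an additional selection step may help.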

A reprogramming vector may further comprise a selection 
marker, more specifically, a negative selection marker, such 
as a gene encoding a thymidine kinase, to select for progeny 
cells essentially free of such a selection marker. The human 
herpes simplex virus thymidine kinase type 1 gene (HSVtk) 
acts as a conditional lethal marker in mammalian cells. The 
HSVtk-encoded enzyme is able to phosphorylate certain 
nucleoside analogs (e.g., ganciclovir, an antiherpetic drug), 
thus converting them to toxic DNA replication inhibitors. An 
alternative or complementary approach is to test the absence 
of exogenous genetic elements in progeny cells, using 
conventional methods, such as RT-PCR, PCR, FISH (Fluo- 
rescent in situ hybridization), gene array, or hybridization 
(e.g., Southern blot). 


VII. Culturing of iPS Cells 


After somatic cells are introduced with a reprogramming 
vector using the disclosed methods, these cells may be 
cultured in a medium sufficient to maintain the pluripotency. 
Culturing of induced pluripotent stem (iPS) cells generated 
in this invention can use various media and techniques 
developed to culture primate pluripotent stem cells, more 
specifically, embryonic stem cells, as described in U.S. Pat. 
App. 20070238170 and U.S. Pat. App. 20030211603. 

For example, like human embryonic stem (hES) cells, iPS 
cells can be maintained in 80% DMEM (Gibco #10829-018 
or #11965-092), 20% defined fetal bovine serum (FBS) not 
heat inactivated, 1% non-essential amino acids, 1 mM 
L-glutamine, and 0.1 mM β-mercaptoethanol. Alternatively, 
ES cells can be maintained in serum-free medium, 
made with 80% Knock-Out DMEM (Gibco #10829-018), 
20% serum replacement (Gibco #10828-028), 1% 
non-essential amino acids, 1 mM L-glutamine, and 0.1 mM 
β-mercaptoethanol. Just before use, human bFGF is 
added to a final concentration of about 4 ng/mL (WO 
99/20741). 
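
The bFGF addition can be worked through as a simple dilution calculation. The sketch below is illustrative only: the function name is hypothetical, the stock concentration used in the usage note is an assumed value that is not given in this disclosure, and the small volume contributed by the stock itself is ignored.

```python
def stock_volume_ul(final_ng_per_ml: float, medium_ml: float,
                    stock_ug_per_ml: float) -> float:
    """Volume of stock solution (in microliters) needed to reach the desired
    final concentration in the given volume of medium."""
    if stock_ug_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    total_ng = final_ng_per_ml * medium_ml  # total mass of factor required
    stock_ng_per_ul = stock_ug_per_ml       # 1 ug/mL is equal to 1 ng/uL
    return total_ng / stock_ng_per_ul
```

For instance, reaching about 4 ng/mL in 500 mL of medium from an assumed 10 μg/mL stock would call for `stock_volume_ul(4, 500, 10)`, that is, 200 μL.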

IPS cells, like ES cells, have characteristic antigens that 
can be identified by immunohistochemistry or flow cytom- 
etry, using antibodies for SSEA-1, SSEA-3 and SSEA-4 
(Developmental Studies Hybridoma Bank, National Insti- 
tute of Child Health and Human Development, Bethesda 
Md.) and TRA-1-60 and TRA-1-81 (Andrews et al, in 
Robertson E, ed. Teratocarcinomas and Embryonic Stem 
Cells. IRL Press, 207-246, 1987). Pluripotency of embry- 
onic stem cells can be confirmed by injecting approximately 
0.5-10×10^6 cells into the rear leg muscles of 8-12 week old 
male SCID mice. Teratomas develop that demonstrate at 
least one cell type of each of the three germ layers. 


VIII. EXAMPLES 


The following examples are included to demonstrate 
preferred embodiments of the invention. It should be appre- 
ciated by those of skill in the art that the techniques 
disclosed in the examples which follow represent techniques 
discovered by the inventor to function well in the practice of 
the invention, and thus can be considered to constitute 
preferred modes for its practice. However, those of skill in 
the art should, in light of the present disclosure, appreciate 
that many changes can be made in the specific embodiments 
which are disclosed and still obtain a like or similar result 
without departing from the spirit and scope of the invention. 


Example 1 


Construction of a Residue-Free Reprogramming 
Plasmid 


The inventors construct a recipient backbone plasmid 
which contains the oriP sequence including DS and FR 
separated by approximately 1,000 base pairs derived from 
EBV (see Lindner and Sugden, 2007) and the abbreviated 
form of wild-type EBNA1 known as deltaUR1 (referred to 
as DomNeg2 in Kennedy et al. 2003) (FIG. 3). The plasmid 
is currently built to express deltaUR1 driven by the elongation 
factor 1α (EF1α) promoter which also contains an 
intronic sequence to maximize the expression of deltaUR1. 
The backbone plasmid has been currently set up to include 
a selection marker for mammalian cells encoding resistance 
to Hygromycin; however, the choice of resistance marker 
remains flexible according to the sensitivity of the cell line 
in which the plasmid will be introduced. Similarly, the 
plasmid encodes drug-resistance for prokaryotic selection 
and, in this case, the plasmid encodes resistance to ampi- 
cillin. 

The inventors integrated a number of cassettes within the 
recipient plasmid described above that encode the genes 
required for and contribute to reprogramming cells to 
become pluripotent (i.e., iPS cells). One cassette encoded 
two genes essential for the reprogramming process, Sox-2 
and Oct-4 (FIG. 4). The inventors could use the phospho- 
glycerate kinase (PGK) promoter, cytomegalovirus imme- 
diate-early gene (CMV) promoter or SV40 promoter to drive 
the expression of Sox-2 and Oct-4 but this choice is subject 
to change depending on the efficiency of that expression. 
Optionally, they have also included the second intron from 
the human beta-globin gene to also maximize expression of 
the transcript. Both genes, therefore, were encoded by the 
same transcript while translation could be initiated from a 
canonical ATG for Sox-2 and an internal ribosome entry site 
(IRES) derived from the encephalomyocarditis virus for 
Oct-4. Similarly, other cassettes encoded a bicistronic tran- 
script encoding Nanog and Lin28 or encoded a bicistronic 
transcript encoding Klf4 and c-Myc (which were driven by 
the PGK promoter or any other suitable promoter) and 
separated by an IRES as well (FIG. 4). Variations of a 
multi-cistronic transcript comprising any two or more of 
Sox-2, Oct-4, Nanog, Lin28, Klf4, c-Myc, EBNA-1 could be 
used. 
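
The bicistronic cassettes described above can be summarized with a small data-structure sketch in Python. The class name `Cassette` and its `layout` method are hypothetical; the choice of the PGK promoter for every cassette and the trailing polyadenylation element are assumptions made for illustration, since the text leaves the promoter choice open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cassette:
    promoter: str
    orfs: tuple[str, ...]  # open reading frames in 5'-to-3' order

    def layout(self) -> str:
        # The first ORF is translated from its canonical ATG; each later
        # ORF is preceded by an IRES so it can be translated independently.
        body = " - IRES - ".join(self.orfs)
        return f"{self.promoter} promoter - {body} - polyA"


# Illustrative cassettes following the gene pairings given in Example 1.
CASSETTES = [
    Cassette("PGK", ("Sox-2", "Oct-4")),
    Cassette("PGK", ("Nanog", "Lin28")),
    Cassette("PGK", ("Klf4", "c-Myc")),
]
```

Calling `layout()` on the first entry yields `PGK promoter - Sox-2 - IRES - Oct-4 - polyA`; a multi-cistronic variant would simply list more than two open reading frames.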

There could be certain variations of the system to opti- 
mize its efficiency. Current literature indicates that Lin28 
may be dispensable for the reprogramming process and 
therefore it is likely that the inventors could adjust this 
plasmid system to include only Sox-2, Oct-4, and Nanog. 
Furthermore, the type of IRES chosen has been proven 
functional albeit in the context of other gene sets. However, 
it is possible that the IRES may prove inadequate to promote 
the levels of expression required for proper reprogramming 
and may result in breaking the cassettes up such that each 
reprogramming gene will be driven by its own human 
promoter. 

In summary, the master shuttling plasmid or reprogram- 
ming plasmid could encode Sox-2, Oct-4, Nanog, and pos- 
sibly Lin28 (FIG. 5) while its replication and maintenance 
could be promoted by the presence of oriP and deltaUR1. 
This plasmid will also be poised for future modifications to 
include a negative selection marker such as thymidine 
kinase and an additional positive selection marker such as 
sequences encoding green or red fluorescent protein. 


Example 2 
Use of a Residue-Free Reprogramming Plasmid 


Successful reprogramming will depend on the efficient
introduction of this large (15-20 kb) plasmid into mammalian
cells. The inventors are currently employing a lipophilic-based
approach to introduce the DNA into human
fibroblasts; however, this approach is likely to be modified
according to the cell type being transfected. For example,
they would likely choose electroporation for the introduction
of DNA plasmids into hematopoietic cells. Once cells are
properly transfected, the inventors will place these cells on
a bed of irradiated mouse embryonic fibroblasts (MEFs) or
matrigel on 10 cm culture dishes using media suitable for the
transfected cells. Approximately six days following transfection,
the media will be changed to media specialized for
reprogramming cells and replaced daily to every other day
(Yu et al., 2007).

Based on their current method of generating iPS cells, the
inventors will likely select colonies resembling stem cells
around twenty days post-transfection and transfer them to
MEFs or matrigel in 6-well culture plates while feeding daily
or every other day with specialized media. Once sufficient
expansion has taken place, clones will be karyotyped and
tested for proper markers specific to stem cells.
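The approximate timeline of this prophetic example can be laid out as a simple schedule. The Python sketch below is only an illustration: the names `MILESTONES` and `schedule` are hypothetical, and the day offsets (0, 6 and 20) are the approximate values stated in the text rather than fixed protocol requirements.

```python
from datetime import date, timedelta
from typing import List, Tuple

# (days after transfection, step); offsets are approximate per the text.
MILESTONES = [
    (0, "transfect cells and plate on irradiated MEFs or matrigel"),
    (6, "switch to reprogramming media; replace daily to every other day"),
    (20, "pick stem-cell-like colonies and transfer to 6-well plates"),
]


def schedule(transfection_day: date) -> List[Tuple[date, str]]:
    """Map each milestone to a calendar date for a given transfection day."""
    return [
        (transfection_day + timedelta(days=offset), step)
        for offset, step in MILESTONES
    ]


if __name__ == "__main__":
    for day, step in schedule(date(2024, 1, 1)):
        print(day.isoformat(), step)
```

Karyotyping and marker testing are left out of the schedule because the text ties them to sufficient clonal expansion rather than to a fixed day.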

All of the methods disclosed and claimed herein can be 
made and executed without undue experimentation in light 
of the present disclosure. While the compositions and meth- 
ods of this invention have been described in terms of 
preferred embodiments, it will be apparent to those of skill 
in the art that variations may be applied to the methods and 
in the steps or in the sequence of steps of the method 
described herein without departing from the concept, spirit 
and scope of the invention. More specifically, it will be 
apparent that certain agents which are both chemically and 
physiologically related may be substituted for the agents 
described herein while the same or similar results would be 
achieved. All such similar substitutes and modifications 
apparent to those skilled in the art are deemed to be within 
the spirit, scope and concept of the invention as defined by 
the appended claims. 
SEQUENCE LISTING


<160> NUMBER OF SEQ ID NOS: 4

<210> SEQ ID NO 1
<211> LENGTH: 641
<212> TYPE: PRT
<213> ORGANISM: Human herpesvirus 4

<400> SEQUENCE: 1


Met Ser Asp Glu Gly Pro Gly Thr Gly Pro Gly Asn Gly Leu Gly Glu 
1 5 10 15 


Lys Gly Asp Thr Ser Gly Pro Glu Gly Ser Gly Gly Ser Gly Pro Gln 
20 25 30 


Arg Arg Gly Gly Asp Asn His Gly Arg Gly Arg Gly Arg Gly Arg Gly 
35 40 45 


Arg Gly Gly Gly Arg Pro Gly Ala Pro Gly Gly Ser Gly Ser Gly Pro 
50 55 60 


Arg His Arg Asp Gly Val Arg Arg Pro Gln Lys Arg Pro Ser Cys Ile 
65 70 75 80 


Gly Cys Lys Gly Thr His Gly Gly Thr Gly Ala Gly Ala Gly Ala Gly 
85 90 95 


Gly Ala Gly Ala Gly Gly Ala Gly Ala Gly Gly Gly Ala Gly Ala Gly 
100 105 110 


Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly Ala Gly Gly 
115 120 125 


Gly Ala Gly Ala Gly Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly Ala 
130 135 140 


Gly Gly Gly Ala Gly Ala Gly Gly Gly Ala Gly Gly Ala Gly Ala Gly 
145 150 155 160 


Gly Gly Ala Gly Gly Ala Gly Gly Ala Gly Ala Gly Gly Gly Ala Gly 
165 170 175 


Ala Gly Gly Gly Ala Gly Gly Ala Gly Ala Gly Gly Gly Ala Gly Gly 
180 185 190 


Ala Gly Gly Ala Gly Ala Gly Gly Gly Ala Gly Ala Gly Gly Ala Gly 
195 200 205 


Gly Ala Gly Gly Ala Gly Ala Gly Gly Ala Gly Ala Gly Gly Gly Ala 
210 215 220 


Gly Gly Ala Gly Gly Ala Gly Ala Gly Gly Ala Gly Ala Gly Gly Ala 
225 230 235 240 


Gly Ala Gly Gly Ala Gly Ala Gly Gly Ala Gly Gly Ala Gly Ala Gly 
245 250 255 


Gly Ala Gly Gly Ala Gly Ala Gly Gly Ala Gly Gly Ala Gly Ala Gly 
260 265 270 


Gly Gly Ala Gly Gly Ala Gly Ala Gly Gly Gly Ala Gly Gly Ala Gly 
275 280 285 


Ala Gly Gly Ala Gly Gly Ala Gly Ala Gly Gly Ala Gly Gly Ala Gly 
290 295 300 


Ala Gly Gly Ala Gly Gly Ala Gly Ala Gly Gly Gly Ala Gly Ala Gly 
305 310 315 320 


Gly Ala Gly Ala Gly Gly Gly Gly Arg Gly Arg Gly Gly Ser Gly Gly 
325 330 335 


Arg Gly Arg Gly Gly Ser Gly Gly Arg Gly Arg Gly Gly Ser Gly Gly 
340 345 350 


Arg Arg Gly Arg Gly Arg Glu Arg Ala Arg Gly Gly Ser Arg Glu Arg 
355 360 365 


Ala Arg Gly Arg Gly Arg Gly Arg Gly Glu Lys Arg Pro Arg Ser Pro
370 375 380

Ser Ser Gln Ser Ser Ser Ser Gly Ser Pro Pro Arg Arg Pro Pro Pro
385 390 395 400

Gly Arg Arg Pro Phe Phe His Pro Val Gly Glu Ala Asp Tyr Phe Glu
405 410 415

Tyr His Gln Glu Gly Gly Pro Asp Gly Glu Pro Asp Val Pro Pro Gly
420 425 430

Ala Ile Glu Gln Gly Pro Ala Asp Asp Pro Gly Glu Gly Pro Ser Thr
435 440 445

Gly Pro Arg Gly Gln Gly Asp Gly Gly Arg Arg Lys Lys Gly Gly Trp
450 455 460

Phe Gly Lys His Arg Gly Gln Gly Gly Ser Asn Pro Lys Phe Glu Asn
465 470 475 480

Ile Ala Glu Gly Leu Arg Ala Leu Leu Ala Arg Ser His Val Glu Arg
485 490 495

Thr Thr Asp Glu Gly Thr Trp Val Ala Gly Val Phe Val Tyr Gly Gly
500 505 510

Ser Lys Thr Ser Leu Tyr Asn Leu Arg Arg Gly Thr Ala Leu Ala Ile
515 520 525

Pro Gln Cys Arg Leu Thr Pro Leu Ser Arg Leu Pro Phe Gly Met Ala
530 535 540

Pro Gly Pro Gly Pro Gln Pro Gly Pro Leu Arg Glu Ser Ile Val Cys
545 550 555 560

Tyr Phe Met Val Phe Leu Gln Thr His Ile Phe Ala Glu Val Leu Lys
565 570 575

Asp Ala Ile Lys Asp Leu Val Met Thr Lys Pro Ala Pro Thr Cys Asn
580 585 590

Ile Arg Val Thr Val Cys Ser Phe Asp Asp Gly Val Asp Leu Pro Pro
595 600 605

Trp Phe Pro Pro Met Val Glu Gly Ala Ala Ala Glu Gly Asp Asp Gly
610 615 620

Asp Asp Gly Asp Glu Gly Gly Asp Gly Asp Glu Gly Glu Glu Gly Gln
625 630 635 640

Glu


<210> SEQ ID NO 2
<211> LENGTH: 1926
<212> TYPE: DNA
<213> ORGANISM: Human herpesvirus 4

<400> SEQUENCE: 2

atgtctgacg aggggccagg tacaggacct ggaaatggcc taggagagaa gggagacaca 60
tctggaccag aaggctccgg cggcagtgga cctcaaagaa gagggggtga taaccatgga 120
cgaggacggg gaagaggacg aggacgagga ggcggaagac caggagcccc gggcggctca 180
ggatcagggc caagacatag agatggtgtc cggagacccc aaaaacgtcc aagttgcatt 240
ggctgcaaag ggacccacgg tggaacagga gcaggagcag gagcgggagg ggcaggagca 300
ggaggggcag gagcaggagg aggggcagga gcaggaggag gggcaggagg ggcaggaggg 360
gcaggagggg caggagcagg aggaggggca ggagcaggag gaggggcagg aggggcagga 420
ggggcaggag caggaggagg ggcaggagca ggaggagggg caggaggggc aggagcagga 480
ggaggggcag gaggggcagg aggggcagga gcaggaggag gggcaggagc aggaggaggg 540
gcaggagggg caggagcagg aggaggggca ggaggggcag gaggggcagg agcaggagga 600
ggggcaggag caggaggggc aggaggggca ggaggggcag gagcaggagg ggcaggagca 660
ggaggagggg caggaggggc aggaggggca ggagcaggag gggcaggagc aggaggggca 720
ggagcaggag gggcaggagc aggaggggca ggaggggcag gagcaggagg ggcaggaggg 780
gcaggagcag gaggggcagg aggggcagga gcaggaggag gggcaggagg ggcaggagca 840
ggaggagggg caggaggggc aggagcagga ggggcaggag gggcaggagc aggaggggca 900
ggaggggcag gagcaggagg ggcaggaggg gcaggagcag gaggaggggc aggagcagga 960
ggggcaggag caggaggtgg aggccggggt cgaggaggca gtggaggccg gggtcgagga 1020
ggtagtggag gccggggtcg aggaggtagt ggaggccgcc ggggtagagg acgtgaaaga 1080
gccagggggg gaagtcgtga aagagccagg gggagaggtc gtggacgtgg agaaaagagg 1140
cccaggagtc ccagtagtca gtcatcatca tccgggtctc caccgcgcag gccccctcca 1200
ggtagaaggc catttttcca ccctgtaggg gaagccgatt attttgaata ccaccaagaa 1260
ggtggcccag atggtgagcc tgacgtgccc ccgggagcga tagagcaggg ccccgcagat 1320
gacccaggag aaggcccaag cactggaccc cggggtcagg gtgatggagg caggcgcaaa 1380
aaaggagggt ggtttggaaa gcatcgtggt caaggaggtt ccaacccgaa atttgagaac 1440
attgcagaag gtttaagagc tctcctggct aggagtcacg tagaaaggac taccgacgaa 1500
ggaacttggg tcgccggtgt gttcgtatat ggaggtagta agacctccct ttacaaccta 1560
aggcgaggaa ctgcccttgc tattccacaa tgtcgtctta caccattgag tcgtctcccc 1620
tttggaatgg cccctggacc cggcccacaa cctggcccgc taagggagtc cattgtctgt 1680
tatttcatgg tctttttaca aactcatata tttgctgagg ttttgaagga tgcgattaag 1740
gaccttgtta tgacaaagcc cgctcctacc tgcaatatca gggtgactgt gtgcagcttt 1800
gacgatggag tagatttgcc tccctggttt ccacctatgg tggaaggggc tgccgcggag 1860
ggtgatgacg gagatgacgg agatgaagga ggtgatggag atgagggtga ggaagggcag 1920
gagtga 1926


<210> SEQ ID NO 3
<211> LENGTH: 392
<212> TYPE: PRT
<213> ORGANISM: Human herpesvirus 4

<400> SEQUENCE: 3

Met Ser Asp Glu Gly Pro Gly Thr Gly Pro Gly Asn Gly Leu Gly Glu
1 5 10 15

Lys Gly Asp Thr Ser Gly Pro Glu Gly Ser Gly Gly Ser Gly Pro Gln
20 25 30

Arg Arg Gly Gly Asp Asn His Gly Arg Gly Arg Gly Arg Gly Arg Gly
35 40 45

Arg Gly Gly Gly Arg Pro Gly Ala Pro Gly Gly Ser Gly Ser Gly Ala
50 55 60

Gly Ala Gly Ala Gly Ala Gly Gly Ala Gly Ala Gly Gly Gly Gly Arg
65 70 75 80

Gly Arg Gly Gly Ser Gly Gly Arg Gly Arg Gly Gly Ser Gly Gly Arg
85 90 95

Gly Arg Gly Gly Ser Gly Gly Arg Arg Gly Arg Gly Arg Glu Arg Ala
100 105 110

Arg Gly Gly Ser Arg Glu Arg Ala Arg Gly Arg Gly Arg Gly Arg Gly
115 120 125

Glu Lys Arg Pro Arg Ser Pro Ser Ser Gln Ser Ser Ser Ser Gly Ser
130 135 140

Pro Pro Arg Arg Pro Pro Pro Gly Arg Arg Pro Phe Phe His Pro Val
145 150 155 160

Gly Glu Ala Asp Tyr Phe Glu Tyr His Gln Glu Gly Gly Pro Asp Gly
165 170 175

Glu Pro Asp Val Pro Pro Gly Ala Ile Glu Gln Gly Pro Ala Asp Asp
180 185 190

Pro Gly Glu Gly Pro Ser Thr Gly Pro Arg Gly Gln Gly Asp Gly Gly
195 200 205

Arg Arg Lys Lys Gly Gly Trp Phe Gly Lys His Arg Gly Gln Gly Gly
210 215 220

Ser Asn Pro Lys Phe Glu Asn Ile Ala Glu Gly Leu Arg Ala Leu Leu
225 230 235 240

Ala Arg Ser His Val Glu Arg Thr Thr Asp Glu Gly Thr Trp Val Ala
245 250 255

Gly Val Phe Val Tyr Gly Gly Ser Lys Thr Ser Leu Tyr Asn Leu Arg
260 265 270

Arg Gly Thr Ala Leu Ala Ile Pro Gln Cys Arg Leu Thr Pro Leu Ser
275 280 285

Arg Leu Pro Phe Gly Met Ala Pro Gly Pro Gly Pro Gln Pro Gly Pro
290 295 300

Leu Arg Glu Ser Ile Val Cys Tyr Phe Met Val Phe Leu Gln Thr His
305 310 315 320

Ile Phe Ala Glu Val Leu Lys Asp Ala Ile Lys Asp Leu Val Met Thr
325 330 335

Lys Pro Ala Pro Thr Cys Asn Ile Arg Val Thr Val Cys Ser Phe Asp
340 345 350

Asp Gly Val Asp Leu Pro Pro Trp Phe Pro Pro Met Val Glu Gly Ala
355 360 365

Ala Ala Glu Gly Asp Asp Gly Asp Asp Gly Asp Glu Gly Gly Asp Gly
370 375 380

Asp Glu Gly Glu Glu Gly Gln Glu
385 390


<210> SEQ ID NO 4
<211> LENGTH: 1179
<212> TYPE: DNA
<213> ORGANISM: Human herpesvirus 4

<400> SEQUENCE: 4

atgtctgacg aggggccagg tacaggacct ggaaatggcc taggagagaa gggagacaca 60
tctggaccag aaggctccgg cggcagtgga cctcaaagaa gagggggtga taaccatgga 120
cgaggacggg gaagaggacg aggacgagga ggcggaagac caggagcccc gggcggctca 180
ggatcagggg ccggcgcagg agcaggagcg ggaggggcag gagcaggagg tggaggccgg 240
ggtcgaggag gcagtggagg ccggggtcga ggaggtagtg gaggccgggg tcgaggaggt 300
agtggaggcc gccggggtag aggacgtgaa agagccaggg ggggaagtcg tgaaagagcc 360
agggggagag gtcgtggacg tggagaaaag aggcccagga gtcccagtag tcagtcatca 420
tcatccgggt ctccaccgcg caggccccct ccaggtagaa ggccattttt ccaccctgta 480
ggggaagccg attattttga ataccaccaa gaaggtggcc cagatggtga gcctgacgtg 540
cccccgggag cgatagagca gggccccgca gatgacccag gagaaggccc aagcactgga 600
ccccggggtc agggtgatgg aggcaggcgc aaaaaaggag ggtggtttgg aaagcatcgt 660
ggtcaaggag gttccaaccc gaaatttgag aacattgcag aaggtttaag agctctcctg 720
gctaggagtc acgtagaaag gactaccgac gaaggaactt gggtcgccgg tgtgttcgta 780
tatggaggta gtaagacctc cctttacaac ctaaggcgag gaactgccct tgctattcca 840
caatgtcgtc ttacaccatt gagtcgtctc ccctttggaa tggcccctgg acccggccca 900
caacctggcc cgctaaggga gtccattgtc tgttatttca tggtcttttt acaaactcat 960
atatttgctg aggttttgaa ggatgcgatt aaggaccttg ttatgacaaa gcccgctcct 1020
acctgcaata tcagggtgac tgtgtgcagc tttgacgatg gagtagattt gcctccctgg 1080
tttccaccta tggtggaagg ggctgccgcg gagggtgatg acggagatga cggagatgaa 1140
ggaggtgatg gagatgaggg tgaggaaggg caggagtga 1179


What is claimed is: 

1. An isolated population of human induced pluripotent 
stem (iPS) cells, said population comprising human pluri- 
potent cells that comprise Epstein-Barr virus (EBV)-based 
reprogramming vectors comprising a replication origin and 
one or more expression cassettes encoding iPS reprogram- 
ming factors comprising at least Sox-2, Oct-4, wherein said 
cells, or one or more of said expression cassettes, comprise 
a nucleotide sequence encoding a trans-acting factor that 
binds to the replication origin to replicate an extra-chromo- 
somal template, the cell population comprising the genome 
of a selected adult human individual and essentially free of 
chromosomally integrated retroviral elements.

2. The isolated human iPS cell population of claim 1, 
wherein the genome is that of a terminally differentiated 
human cell. 

3. The isolated population of claim 1, wherein the 
Epstein-Barr virus (EBV)-based vectors encode at least the 
reprogramming factors Oct4 and Sox2, and additionally one 
or more of the reprogramming factors selected from the 
group consisting of Lin28, Nanog, Klf4 and c-Myc. 

4. The isolated population of claim 1, wherein the popu- 
lation further comprises human pluripotent cells that are 
essentially free of Epstein-Barr (EBV)-based vectors. 

5. An isolated population of human induced pluripotent
stem (iPS) cells, said population comprising human pluripotent
cells that comprise Epstein-Barr virus (EBV)-based
reprogramming vectors comprising a replication origin and
one or more expression cassettes encoding at least the
reprogramming factors Oct4 and Sox2, and additionally one
or more of the reprogramming factors selected from the
group consisting of Lin28, Nanog, Klf4 and c-Myc, wherein
said cells, or one or more of said expression cassettes,
comprise a nucleotide sequence encoding a trans-acting
factor that binds to the replication origin to replicate an
extra-chromosomal template, wherein the iPS cells of the
population have been produced from somatic cells of a
selected human individual and comprise the genome of that
individual and are essentially free of chromosomally integrated
retroviral elements.

6. The isolated population of claim 5, wherein the popu- 
lation further comprises human pluripotent cells that are 
essentially free of Epstein-Barr (EBV)-based vectors. 

7. The isolated population of claim 4 or 6, wherein the
human pluripotent cells that comprise Epstein-Barr (EBV)-based
vectors constitute less than 10% of said population.

8. The isolated population of claim 7, wherein the human
pluripotent cells that comprise Epstein-Barr (EBV)-based
vectors constitute less than 1% of said population.

9. The isolated population of claim 8, wherein the human
pluripotent cells that comprise Epstein-Barr (EBV)-based
vectors constitute less than 0.5% of said population.

10. The isolated population of claim 9, wherein the human
pluripotent cells that comprise Epstein-Barr (EBV)-based
vectors constitute less than 0.1% of said population.


* * * * * 


